## Optimal Power Flow
**Power Systems Optimization**

by Michael R. Davidson and Jesse D. Jenkins (last updated: October 4, 2020)

This notebook provides an introductory glimpse of the Optimal Power Flow (OPF) problem, which minimizes the short-run production costs of meeting electricity demand at a number of connected locations from a given set of generators, subject to various technical and flow limit constraints. This will be our first treatment of a **network**, which is critical to all power systems.

We will first introduce a model of transmission flows that assumes we can control the flow along each path, in what is called a "transport model". This is a straightforward extension to Economic Dispatch (ED), where we have multiple supply and demand balance constraints, and a new set of flow constraints. This is also similar to other common optimization problems such as fleet routing of shipments.

We will then introduce a linear approximation to the optimal power flow problem known as "DC-OPF", where we begin to incorporate some of the physics involved in how electricity flows along transmission lines. With this, we recognize that given "injections" (i.e., generation) and "withdrawals" (i.e., demand) of power, flows along lines are not independently controllable. As a result, we will very frequently hit flow constraints sooner than we would if we could control the flows as in the transport problem.

Our model does not explore the full functionality of DC-OPF, which can include inter-temporal constraints, additional generation constraints (e.g., on voltage), security constraints to ensure stability in the case of contingencies, and network losses.

Full "AC optimal power flow" models are also beyond the scope of this notebook, as the full set of physics associated with the interactions of AC flows introduces non-linearities that are much harder to solve.

We will start off with some simple systems, whose solutions can be worked out manually without resorting to any mathematical optimization model or software. Eventually, we will solve larger systems, which highlights the importance of such software and mathematical models.

## Introduction to OPF


Optimal Power Flow (OPF) is a power system optimal scheduling problem that captures the physics of electricity flows, adding a layer of complexity and realism to the Economic Dispatch (ED) problem. OPF usually attempts to capture the entire network topology by representing the transmission lines that connect different nodes (also known as buses), including electrical parameters such as resistance, series reactance, and shunt admittance. The full "AC" OPF is a non-convex problem that is extremely hard to solve (NP-hard in general). Hence, system operators and power marketers usually solve a linearized version of it, called DC-OPF. The DC-OPF approximation works satisfactorily for bulk power transmission networks as long as they are not operated at the brink of instability or under heavily loaded conditions.

## "Transport" model
We will first examine the case where we allow for transmission but ignore the physics of electricity flows, and instead treat it like transporting an ordinary commodity.

$$
\begin{align}
\min \ & \sum_{g \in G} VarCost_g \times GEN_g & \\
\text{s.t.} & \\
 & \sum_{g \in G_i} GEN_g - Demand_i = \sum_{j \in J_i} FLOW_{ij} & \forall \quad i \in \mathcal{N}\\
 & FLOW_{ij} \leq MaxFlow_{ij} & \forall \quad i \in \mathcal{N}, \forall j \in J_i \\
 & FLOW_{ji} = - FLOW_{ij} & \forall \quad i, j \in \mathcal{N} \\
 & GEN_g \leq Pmax_g & \forall \quad g \in G \\
 & GEN_g \geq Pmin_g & \forall \quad g \in G  
\end{align}
$$

We introduce a few new **sets** in the above:
- $\mathcal{N}$, the set of all nodes (or buses) in the network
- $J_i$, the subset of nodes that are connected to node $i$
- $G_i \subset G$, the subset of generators located at node $i$
 
The **decision variables** in the above problem are:

- $GEN_{g}$, the generation (in MW) produced by each generator, $g$
- $FLOW_{ij}$, the flow (in MW) along the line from $i$ to $j$

The **parameters** are:

- $Pmin_g$, the minimum operating bounds for the generator (based on engineering or natural resource constraints)
- $Pmax_g$, the maximum operating bounds for the generator (based on engineering or natural resource constraints)
- $Demand_i$, the demand (in MW) at bus $i$
- $MaxFlow_{ij}$, the maximum allowable flow along the line from $i$ to $j$
- $VarCost_g$, the variable cost of generator $g$

Notice how the problem above is equivalent to producing a single type of good at a set of factories and shipping it along capacity-limited corridors (roads, rail lines, etc.) to meet a set of demands in other locations.

### 1. Load packages

In [1]:
using JuMP, GLPK
using Plots; plotly();
using VegaLite  # to make some nice plots
using DataFrames, CSV, PrettyTables
ENV["COLUMNS"]=120; # Set so all columns of DataFrames and Matrices are displayed



### 2. Load and format data

We will load a modified 3-bus case stored in the [MATPOWER case format](https://matpower.org/docs/ref/matpower5.0/caseformat.html). It consists of:

- two generator buses with variable costs 50 and 100
- one load bus
- three lines connecting the buses

The location and numbering of the components:

<img src="img/opf_network.png" style="width: 450px; height: auto" align="left">

In [2]:
datadir = joinpath("opf_data") 
gen = CSV.read(joinpath(datadir,"gen.csv"), DataFrame);
gencost = CSV.read(joinpath(datadir,"gencost.csv"), DataFrame);
branch = CSV.read(joinpath(datadir,"branch.csv"), DataFrame);
bus = CSV.read(joinpath(datadir,"bus.csv"), DataFrame);

# Rename all columns to lowercase (by convention)
for f in [gen, gencost, branch, bus]
    rename!(f,lowercase.(names(f)))
end

# create generator ids 
gen.id = 1:nrow(gen);
gencost.id = 1:nrow(gencost);

# create line ids 
branch.id = 1:nrow(branch);
# add set of rows for reverse direction with same parameters
branch2 = copy(branch)
branch2.f = branch2.fbus
branch2.fbus = branch.tbus
branch2.tbus = branch2.f
branch2 = branch2[:,names(branch)]
append!(branch,branch2)

# Here are the buses:
bus

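| Row | bus_i | type | pd | qd | gs | bs | area | vm | va | basekv | zone | vmax | vmin |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Int64 | Int64 | Int64 | Float64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Float64 | Float64 |
| 1 | 1 | 2 | 0 | 0.0 | 0 | 0 | 1 | 1 | 0 | 230 | 1 | 1.1 | 0.9 |
| 2 | 2 | 2 | 0 | 0.0 | 0 | 0 | 1 | 1 | 0 | 230 | 1 | 1.1 | 0.9 |
| 3 | 3 | 1 | 600 | 98.61 | 0 | 0 | 1 | 1 | 0 | 230 | 1 | 1.1 | 0.9 |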


Columns `pd` and `qd` indicate the active (MW) and reactive (MVAr) power withdrawal at the bus. (We will ignore `qd` for this notebook.) We do not need any of the other columns for our purposes.

In [3]:
# This is what the generator dataset looks like:
gen

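| Row | bus | pg | qg | qmax | qmin | vg | mbase | status | pmax | pmin | pc1 | pc2 | qc1min | qc1max | qc2min | qc2max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Int64 | Int64 | Int64 | Float64 | Float64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 |
| 1 | 1 | 40 | 0 | 30.0 | -30.0 | 1 | 100 | 1 | 1000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 2 | 170 | 0 | 127.5 | -127.5 | 1 | 100 | 1 | 1000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |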


In [4]:
# and generator cost dataset:
gencost

| Row | model | startup | shutdown | n | x1 | y1 | id |
|---|---|---|---|---|---|---|---|
| | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 |
| 1 | 2 | 0 | 0 | 2 | 50 | 0 | 1 |
| 2 | 2 | 0 | 0 | 2 | 100 | 0 | 2 |


In the above, `model=2` indicates a polynomial cost formulation, and `n=2` indicates that the polynomial has two coefficients: a linear term (`x1`) and a constant term (`y1`). Thus, we have a linear variable cost with no quadratic term and a zero constant term:

$$
VarCost_g = x1_g
$$
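As a small illustration of how such polynomial coefficients translate into a cost, the sketch below evaluates a cost curve at a given output. The helper name `poly_cost` is hypothetical (it is not part of the notebook or of MATPOWER), and it assumes the coefficients are ordered from highest to lowest order, as in the `x1`, `y1` columns above.

```julia
# Hypothetical helper: evaluate a polynomial cost curve at output p (MW).
# coeffs are ordered from highest to lowest order, e.g. [x1, y1] for a linear cost.
function poly_cost(coeffs, p)
    total = 0.0
    for c in coeffs
        total = total * p + c   # Horner's rule
    end
    return total
end

cost_a = poly_cost([50.0, 0.0], 600.0)    # Gen A producing 600 MW
cost_b = poly_cost([100.0, 0.0], 600.0)   # Gen B producing 600 MW
```

With a zero constant term, the result is simply `x1` times the output, which is why the models in this notebook only need `x1`.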

In [5]:
# Here are the transmission lines:
branch

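| Row | fbus | tbus | r | x | b | ratea | rateb | ratec | ratio | angle | status | angmin | angmax | id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Int64 | Int64 | Float64 | Float64 | Float64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 |
| 1 | 1 | 3 | 0.00281 | 0.0281 | 0.00712 | 500 | 500 | 500 | 0 | 0 | 1 | -360 | 360 | 1 |
| 2 | 1 | 2 | 0.00281 | 0.0281 | 0.00712 | 500 | 500 | 500 | 0 | 0 | 1 | -360 | 360 | 2 |
| 3 | 2 | 3 | 0.00281 | 0.0281 | 0.00712 | 500 | 500 | 500 | 0 | 0 | 1 | -360 | 360 | 3 |
| 4 | 3 | 1 | 0.00281 | 0.0281 | 0.00712 | 500 | 500 | 500 | 0 | 0 | 1 | -360 | 360 | 1 |
| 5 | 2 | 1 | 0.00281 | 0.0281 | 0.00712 | 500 | 500 | 500 | 0 | 0 | 1 | -360 | 360 | 2 |
| 6 | 3 | 2 | 0.00281 | 0.0281 | 0.00712 | 500 | 500 | 500 | 0 | 0 | 1 | -360 | 360 | 3 |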


For the first model, we are only using transmission line capacity (known as "ratings"), given in `ratea`, `rateb`, and `ratec`. These correspond to different ratings based on how long the line might be overloaded, with `ratec` known as an "emergency rating", which could exceed the long-term rating, `ratea`. We will use `ratea` for this model. The dataset also contains resistance and reactance.

### 3. Create solver function (transport)

In [6]:
#=
Function to solve transport flow problem 
Inputs:
    gen -- dataframe with generator info
    branch -- dataframe with transmission lines info
    gencost -- dataframe with generator cost info
    bus -- dataframe with bus types and loads
Note: it is always a good idea to include a comment block describing your
function's inputs clearly!
=#
function transport(gen, branch, gencost, bus)
    Transport = Model(GLPK.Optimizer) # You could use Clp as well, with Clp.Optimizer
    
    # Define sets
      # Set of all generators
    G = gen.id
      # Set of all nodes
    N = bus.bus_i
    
      # sets J_i and G_i will be described using dataframe indexing below

    # Decision variables   
    @variables(Transport, begin
        GEN[G]  >= 0     # generation        
        # Note: we assume Pmin = 0 for all resources for simplicity here
        FLOW[N,N]        # flow
        # Note: flow is not constrained to be positive
        # By convention, positive will indicate flow from the first to second node
        #  and a negative flow will indicate flow from the second to the first
        # This matrix is thus "anti-symmetric", which we will ensure with an appropriate
        #  constraint
    end)
                
    # Objective function
    @objective(Transport, Min, 
        sum( gencost[g,:x1] * GEN[g] 
                        for g in G)
    )

    # Supply demand balances
    @constraint(Transport, cBalance[i in N], 
        sum(GEN[g] for g in gen[gen.bus .== i,:id]) 
                - bus[bus.bus_i .== i,:pd][1] ==
        sum(FLOW[i,j] for j in branch[branch.tbus .== i,:fbus]))

    # Max generation constraint
    @constraint(Transport, cMaxGen[g in G],
                    GEN[g] <= gen[g,:pmax])

    # Flow constraints
    for l in 1:nrow(branch)
        @constraint(Transport, 
            FLOW[branch[l,:fbus][1],branch[l,:tbus][1]] <= 
                        branch[l,:ratea])
    end
    
    @constraint(Transport, cFlowSymmetric[i in N, j in N],
                    FLOW[j,i] == -FLOW[i,j])

    # Solve statement (! indicates runs in place)
    optimize!(Transport)

    # Dataframe of optimal decision variables
    generation = DataFrame(
        id = gen.id,
        node = gen.bus,
        gen = value.(GEN).data
        )
    
    flows = value.(FLOW).data

    # Return the solution and objective as named tuple
    return (
        generation = generation, 
        flows,
        cost = objective_value(Transport)
    )
end

transport (generic function with 1 method)

### 4. Solve

In [7]:
solution = transport(gen, branch, gencost, bus)
solution.generation

| Row | id | node | gen |
|---|---|---|---|
| | Int64 | Int64 | Float64 |
| 1 | 1 | 1 | 600.0 |
| 2 | 2 | 2 | 0.0 |


We generate all 600 MW from Gen A at Bus 1.

In [8]:
DataFrame(solution.flows, :auto)  # :auto names the columns x1, x2, x3 (required in DataFrames 1.0 and later)

| Row | x1 | x2 | x3 |
|---|---|---|---|
| | Float64 | Float64 | Float64 |
| 1 | 0.0 | 100.0 | 500.0 |
| 2 | -100.0 | 0.0 | 100.0 |
| 3 | -500.0 | -100.0 | 0.0 |


In turn, the following flows are created: 

- $l_{13}$ = 500 MW
- $l_{12}$ = 100 MW
- $l_{23}$ = 100 MW

Hence, we are able to use the full capacity of the line from 1 to 3 ($l_{13}$), and then route the remaining 100 MW through Bus 2.
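As a quick sanity check, the reported solution can be verified against the transport model's constraints by hand. The sketch below hard-codes the flow matrix and dispatch from the outputs above (the variable names are illustrative, not from the notebook) and checks the nodal balances, the anti-symmetry of flows, and the 500 MW line limits.

```julia
# Flow matrix reported above: F[i, j] is the flow from bus i to bus j (MW)
F = [   0.0   100.0  500.0;
     -100.0     0.0  100.0;
     -500.0  -100.0    0.0]

generation = [600.0, 0.0, 0.0]   # MW injected at buses 1, 2, 3
demand     = [0.0, 0.0, 600.0]   # MW withdrawn at buses 1, 2, 3

# Net export from each bus is the sum of the flows leaving it (row sums of F)
net_export = vec(sum(F, dims=2))

balanced      = net_export ≈ generation .- demand   # supply-demand balance at every bus
antisymmetric = F == -transpose(F)                  # FLOW[j,i] == -FLOW[i,j]
within_limits = all(F .<= 500.0)                    # ratea = 500 MW on every line
```

All three checks should evaluate to `true` for the solution shown.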

## DC linear approximation

The above model is not physically correct as we cannot arbitrarily route power through lines. We will now introduce a linear approximation to the optimal power flow problem that incorporates this limitation and is tractable and reasonably accurate.

In the DC linear approximation, flows along the line from bus i to bus j are driven by voltage phase angle differences:

$$
FLOW_{ij} = B_{ij}(\theta_j-\theta_i)
$$

Modifying the above transport model to incorporate these new decision variables and constraints:

$$
\begin{align}
\min \ & \sum_{g \in G} VarCost_g \times GEN_g & \\
\text{s.t.} & \\
 & \sum_{g \in G_i} GEN_g - Demand_i = \sum_{j \in J_i} B_{ij}(\theta_j-\theta_i) & \forall \quad i \in \mathcal{N}\\
 & B_{ij}(\theta_j-\theta_i) \leq MaxFlow_{ij} & \forall \quad i \in \mathcal{N}, \forall j \in J_i \\
 & GEN_g \leq Pmax_g & \forall \quad g \in G \\
 & GEN_g \geq Pmin_g & \forall \quad g \in G  
\end{align}
$$

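Note that we no longer require a constraint to enforce anti-symmetric flows: since $B_{ij} = B_{ji}$, the expression $B_{ij}(\theta_j-\theta_i)$ automatically changes sign when $i$ and $j$ are swapped.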

We have the following **sets**:
- $\mathcal{N}$, the set of all nodes (or buses) in the network
- $J_i$, the subset of nodes that are connected to node $i$
- $G_i \subset G$, the subset of generators located at node $i$
 
The **decision variables** in the above problem are:

- $GEN_{g}$, the generation (in MW) produced by each generator, $g$
- $\theta_i$, the voltage phase angle at bus $i$ relative to the reference bus (THETA_i in the function below)

The **parameters** are:

- $Pmin_g$, the minimum operating bounds for the generator (based on engineering or natural resource constraints)
- $Pmax_g$, the maximum operating bounds for the generator (based on engineering or natural resource constraints)
- $Demand_i$, the demand (in MW) at bus $i$
- $MaxFlow_{ij}$, the maximum allowable flow along the line from $i$ to $j$
- $VarCost_g$, the variable cost of generator $g$
- $B_{ij}$, the susceptance of the line connecting buses $i$ and $j$ (with $B_{ij} = B_{ji}$)
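To make the link between injections and flows concrete, here is a minimal sketch (separate from the optimization model) that solves the DC flow equations for fixed injections on the 3-bus network. It assumes all three lines share a common susceptance `B` in arbitrary units, uses the same sign convention as the flow equation above, and fixes the angle at Bus 1 to zero; the variable names are illustrative.

```julia
using LinearAlgebra

lines = [(1, 3), (1, 2), (2, 3)]     # the three lines of the 3-bus case
B = 1.0                              # assumed common susceptance (arbitrary units)
injection = [600.0, 0.0, -600.0]     # generation minus demand at each bus (MW)

# Build A so that injection = A * theta, using FLOW_ij = B * (theta_j - theta_i)
A = zeros(3, 3)
for (i, j) in lines
    A[i, j] += B
    A[i, i] -= B
    A[j, i] += B
    A[j, j] -= B
end

# Fix the reference angle theta_1 = 0 and solve for the remaining angles
theta = zeros(3)
theta[2:3] = A[2:3, 2:3] \ injection[2:3]

# Flow on each line implied by the angle differences
dc_flows = Dict((i, j) => B * (theta[j] - theta[i]) for (i, j) in lines)
```

Once the injections are fixed, the flows are fully determined: there is no freedom left to route power as in the transport model. The angle values themselves depend on the assumed scale of `B`, but the flows do not.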



### 3. Create solver function (dcopf)

In [49]:
#=
Function to solve DC OPF problem 
Inputs:
    gen -- dataframe with generator info
    branch -- dataframe with transmission lines info
    gencost -- dataframe with generator cost info
    bus -- dataframe with bus types and loads
Note: it is always a good idea to include a comment block describing your
function's inputs clearly!
=#
function dcopf(gen, branch, gencost, bus)
    DCOPF = Model(GLPK.Optimizer) # You could use Clp as well, with Clp.Optimizer
    
    # Define sets
      # Set of all generators
    G = gen.id
      # Set of all nodes
    N = bus.bus_i
    
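    # Define the base impedance used to convert from per unit (p.u.) to SI units
    # It is common for power systems data to be given in "per unit" (p.u.),
    #  with respect to some given base voltage level (which could vary by line).
    # In the current problem, the whole network is assumed to be at the same voltage,
    #  given in bus.basekv. Power is given in MW (i.e., 10^6 W).
    # The base impedance follows from the formula: Z = V^2 / P
    # Caveat: in the MATPOWER case format, column `b` is the total line charging
    #  susceptance, not the series susceptance (about -1/x when r << x) that the DC
    #  approximation calls for. All three lines here have identical parameters, so
    #  dispatch, flows and prices are unaffected; only the scale and sign of the
    #  reported angles are.
    pu_base = (bus.basekv[1]*1e3)^2 / 1e6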
    
      # sets J_i and G_i will be described using dataframe indexing below

    # Decision variables   
    @variables(DCOPF, begin
        GEN[G]  >= 0     # generation        
        # Note: we assume Pmin = 0 for all resources for simplicity here
        THETA[N]         # voltage phase angle of bus
    end)
    
    # Create slack bus with reference angle = 0
    # Note: by convention this is a generator bus. Hence, we will select bus 1
    fix(THETA[1],0)
                
    # Objective function
    @objective(DCOPF, Min, 
        sum( gencost[g,:x1] * GEN[g] 
                        for g in G)
    )
    
    # Supply demand balances
    #  Note: we multiply here by pu_base
    @constraint(DCOPF, cBalance[i in N], 
        sum(GEN[g] for g in gen[gen.bus .== i,:id]) 
                - bus[bus.bus_i .== i,:pd][1] ==
        sum(branch[(branch.fbus .== i) .& (branch.tbus .== j),:b][1] *
            pu_base * 
            (THETA[j] - THETA[i])
            for j in branch[branch.fbus .== i,:tbus]))

    # Max generation constraint
    @constraint(DCOPF, cMaxGen[g in G],
                    GEN[g] <= gen[g,:pmax])

    # Flow constraints
    #  Note: we multiply here by pu_base
    for l in 1:nrow(branch)
        @constraint(DCOPF,
            branch[l,:b] * pu_base *   
            (THETA[branch[l,:tbus][1]] - THETA[branch[l,:fbus][1]]) <= 
                        branch[l,:ratea])
    end

    # Solve statement (! indicates runs in place)
    optimize!(DCOPF)

    # Output variables
    generation = DataFrame(
        id = gen.id,
        node = gen.bus,
        gen = value.(GEN).data
        )
    
    angles = value.(THETA).data
    
    flows = DataFrame(
        fbus = branch.fbus,
        tbus = branch.tbus,
        flow = pu_base .* branch.b .* (angles[branch.tbus] .- 
                        angles[branch.fbus]))
    
    # We output the marginal values of the demand constraints, 
    # which will in fact be the prices to deliver power at a given bus.
    prices = DataFrame(
        node = bus.bus_i,
        value = dual.(cBalance).data)
    
    # Return the solution and objective as named tuple
    return (
        generation = generation, 
        angles,
        flows,
        prices,
        cost = objective_value(DCOPF)
    )
end

dcopf (generic function with 1 method)

### 4. Solve

In [55]:
solution = dcopf(gen, branch, gencost, bus)
solution.generation

| Row | id | node | gen |
|---|---|---|---|
| | Int64 | Int64 | Float64 |
| 1 | 1 | 1 | 600.0 |
| 2 | 2 | 2 | 0.0 |


Hence, we generate all 600 MW from Gen A at Bus 1.

In [56]:
# These are the voltage phase angles of the buses relative to Bus 1.
solution.angles

3-element Array{Float64,1}:
 0.0
 0.5309997663601028
 1.0619995327202056

In [57]:
solution.flows

| Row | fbus | tbus | flow |
|---|---|---|---|
| | Int64 | Int64 | Float64 |
| 1 | 1 | 3 | 400.0 |
| 2 | 1 | 2 | 200.0 |
| 3 | 2 | 3 | 200.0 |
| 4 | 3 | 1 | -400.0 |
| 5 | 2 | 1 | -200.0 |
| 6 | 3 | 2 | -200.0 |


Thus, we notice that, in contrast to the transport model, we do not max out the capacity of $l_{13}$. The following flows are created: 

- $l_{13}$ = 400 MW
- $l_{12}$ = 200 MW
- $l_{23}$ = 200 MW
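One way to see where this split comes from, assuming (as in this dataset) that all three lines have identical electrical parameters: the two-line path 1->2->3 has twice the reactance of the direct line 1->3, so it carries half as much power. The 600 MW injected at Bus 1 therefore divides 2:1 between the two paths:

$$
\begin{align}
FLOW_{13} &= \tfrac{2}{3} \times 600 = 400 \text{ MW} \\
FLOW_{12} = FLOW_{23} &= \tfrac{1}{3} \times 600 = 200 \text{ MW}
\end{align}
$$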


### 5. Solve high demand case

Now, let's increase demand at Bus 3 to 800 MW. Intuition tells us we will no longer be able to generate all of our power from Gen A alone.

In [58]:
bus_high = copy(bus)
bus_high[3,:pd] = 800

sol_high = dcopf(gen, branch, gencost, bus_high)
sol_high.generation

| Row | id | node | gen |
|---|---|---|---|
| | Int64 | Int64 | Float64 |
| 1 | 1 | 1 | 700.0 |
| 2 | 2 | 2 | 100.0 |


This dispatch is explained by the flow pattern: line $l_{13}$ is at its maximum capacity, so the remaining demand at Bus 3 can only be met by injecting more power at Bus 2.

In [59]:
sol_high.flows

| Row | fbus | tbus | flow |
|---|---|---|---|
| | Int64 | Int64 | Float64 |
| 1 | 1 | 3 | 500.0 |
| 2 | 1 | 2 | 200.0 |
| 3 | 2 | 3 | 300.0 |
| 4 | 3 | 1 | -500.0 |
| 5 | 2 | 1 | -200.0 |
| 6 | 3 | 2 | -300.0 |


The following flows are created: 

- $l_{13}$ = 500 MW
- $l_{12}$ = 200 MW
- $l_{23}$ = 300 MW
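The 700/100 MW dispatch can be recovered by hand. Assuming three identical lines and all power withdrawn at Bus 3, superposition implies that 2/3 of Gen A's output and 1/3 of Gen B's output flows on $l_{13}$ (here $GEN_A$ and $GEN_B$ are shorthand for the outputs of the generators at Bus 1 and Bus 2):

$$
\begin{align}
FLOW_{13} = \tfrac{2}{3} GEN_A + \tfrac{1}{3} GEN_B &\leq 500 \\
GEN_A + GEN_B &= 800 \\
\Rightarrow \quad \tfrac{1}{3} GEN_A + \tfrac{800}{3} \leq 500 \quad &\Rightarrow \quad GEN_A \leq 700
\end{align}
$$

Since Gen A is cheaper, it runs at this bound of 700 MW, and Gen B supplies the remaining 100 MW.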


### 6. Compare prices

The marginal values of the demand constraints at a given bus represent the change in the objective that results from increasing demand at the bus by one unit. This is the natural definition of a "value" of power at that location, and is the basis for **locational marginal prices** (LMPs) found in electricity markets.

We examine first the regular case of demand = 600 MW, then the high demand case = 800 MW.

In [61]:
solution.prices

| Row | node | value |
|---|---|---|
| | Int64 | Float64 |
| 1 | 1 | 50.0 |
| 2 | 2 | 50.0 |
| 3 | 3 | 50.0 |


All prices are the same in this case. The interpretation: if we were to add an incremental load at any of the buses, we could meet it from additional production from Gen A which has marginal cost of \$50 / MWh. We are not going to hit any transmission limits.

In [60]:
sol_high.prices

| Row | node | value |
|---|---|---|
| | Int64 | Float64 |
| 1 | 1 | 50.0 |
| 2 | 2 | 100.0 |
| 3 | 3 | 150.0 |


Something interesting has happened! 

First, note that the prices differ across buses. Incremental load can be met by Gen A only if it is added right at Bus 1, where Gen A is located, which is why the price there remains \$50 / MWh. Similarly, incremental load at Bus 2 can be met by increasing production from Gen B, with marginal cost = \$100 / MWh.

However, why does Bus 3 have a marginal price of \$150 / MWh?!

The answer lies in what must happen to meet an incremental load at Bus 3 while respecting transmission constraints. We must increase output from Gen B, but in doing so part of the power from Gen B will flow along 2->1->3 in addition to 2->3. This additional flow would exceed the transmission limit on line $l_{13}$, which is already at capacity, so we must throttle back power from Gen A.

The exact change for an incremental 1 MW load at Bus 3:
- Gen B $\uparrow$ 2 MW
- Gen A $\downarrow$ 1 MW

Hence:

$$
Price_3 = 2 \times VarCost_B - VarCost_A = \$150 \text{ / MWh}
$$
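The 2 MW / 1 MW redispatch can be reproduced with a small linear solve. The sketch below is illustrative rather than part of the notebook's model: it assumes three identical lines, so that 2/3 of Gen A's output and 1/3 of Gen B's output flows on $l_{13}$ when the power is withdrawn at Bus 3, and it requires that the flow on $l_{13}$ (already at its limit) does not change.

```julia
using LinearAlgebra

# Share of each generator's output that flows on line 1-3 when the power is
# withdrawn at Bus 3 (assumes three identical lines)
share_A = 2 / 3
share_B = 1 / 3
var_cost = [50.0, 100.0]   # $/MWh for Gen A and Gen B

# Unknowns: changes in output [dA, dB] for +1 MW of load at Bus 3
# Row 1: power balance, dA + dB = 1
# Row 2: no change in flow on line 1-3, share_A * dA + share_B * dB = 0
M = [1.0 1.0; share_A share_B]
redispatch = M \ [1.0, 0.0]

# Cost of serving the extra MW, i.e. the marginal price at Bus 3
price_bus3 = dot(var_cost, redispatch)
```

Up to floating-point rounding, `redispatch` comes out as -1 MW for Gen A and +2 MW for Gen B, and `price_bus3` as \$150 / MWh, matching the dual value reported by the solver.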

In a network with thousands of nodes, one can see quite quickly how prices may vary in unexpected ways; hence, the need for detailed mathematical models.