Use DataFrames to linearize indices (#407)
abelsiqueira committed Jan 31, 2024
1 parent cf7bf9b commit ce82e1e
Showing 7 changed files with 570 additions and 234 deletions.
9 changes: 7 additions & 2 deletions benchmark/benchmarks.jl
@@ -26,10 +26,15 @@ const OUTPUT_FOLDER_BM = mktempdir()
 # end
 # constraints_partitions = compute_constraints_partitions(graph, representative_periods)

+# SUITE["direct_usage"]["construct_dataframes"] = @benchmarkable begin
+# construct_dataframes($graph, $representative_periods, $constraints_partitions)
+# end
+# dataframes = construct_dataframes(graph, representative_periods, constraints_partitions)
+#
 # SUITE["direct_usage"]["create_model"] = @benchmarkable begin
-# create_model($graph, $representative_periods, $constraints_partitions)
+# create_model($graph, $representative_periods, $dataframes)
 # end
-# model = create_model(graph, representative_periods, constraints_partitions)
+# model = create_model(graph, representative_periods, dataframes)

 # SUITE["direct_usage"]["solve_model"] = @benchmarkable begin
 # solve_model($model)
19 changes: 14 additions & 5 deletions benchmark/profiling.jl
@@ -11,9 +11,12 @@ input_dir = mktempdir()
 for file in readdir(NORSE_PATH; join = false)
     cp(joinpath(NORSE_PATH, file), joinpath(input_dir, file))
 end
-# Add another line to rep-periods-data.csv
+# Add another line to rep-periods-data.csv and rep-periods-mapping.csv
 open(joinpath(input_dir, "rep-periods-data.csv"), "a") do io
-    println(io, "3,1,$new_rp_length,0.1")
+    println(io, "3,$new_rp_length,0.1")
 end
+open(joinpath(input_dir, "rep-periods-mapping.csv"), "a") do io
+    println(io, "216,3,1")
+end
 # Add profiles to flow and asset
 open(joinpath(input_dir, "flows-profiles.csv"), "a") do io
@@ -46,6 +49,12 @@ end

 #%%

-@time model = create_model(graph, representative_periods, constraints_partitions);
-@benchmark create_model($graph, $representative_periods, $constraints_partitions)
-# @profview create_model(graph, representative_periods, constraints_partitions);
+@time dataframes = construct_dataframes(graph, representative_periods, constraints_partitions)
+@benchmark construct_dataframes($graph, $representative_periods, $constraints_partitions)
+# @profview construct_dataframes($graph, $representative_periods, $constraints_partitions)
+
+#%%
+
+@time model = create_model(graph, representative_periods, dataframes);
+@benchmark create_model($graph, $representative_periods, $dataframes)
+# @profview create_model(graph, representative_periods, dataframes);
125 changes: 108 additions & 17 deletions docs/src/tutorial.md
@@ -96,10 +96,16 @@ constraints_partitions = compute_constraints_partitions(graph, representative_pe

 The `constraints_partitions` has two dictionaries with the keys `:lowest_resolution` and `:highest_resolution`. The lowest resolution dictionary is mainly used to create the constraints for energy balance, whereas the highest resolution dictionary is mainly used to create the capacity constraints in the model.

+Finally, we also need dataframes that store the linearized indexes of the variables.
+
+```@example manual
+dataframes = construct_dataframes(graph, representative_periods, constraints_partitions)
+```
+
 Now we can compute the model.

 ```@example manual
-model = create_model(graph, representative_periods, constraints_partitions)
+model = create_model(graph, representative_periods, dataframes)
 ```

 Finally, we can compute the solution.
@@ -108,6 +114,12 @@ Finally, we can compute the solution.
 solution = solve_model(model)
 ```

+Or, if we want to store the optimal values of `flow` and `storage_level` in the dataframes:
+
+```@example manual
+solution = solve_model!(dataframes, model)
+```
+
 This `solution` structure is exactly the same as the one returned when using an `EnergyProblem`.

 ### Change optimizer and specify parameters
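
As a rough illustration of what "storing the optimal values in the dataframes" could look like, here is a minimal sketch using only DataFrames.jl. It is not the package's implementation of `solve_model!`: the helper `store_solution!`, the toy `flows` table, and the numbers are all hypothetical. The only things taken from the diff are the `index` column and the `solution` column name.

```julia
using DataFrames

# Hypothetical helper (not part of the package): copy linearized values into a
# `solution` column, using each row's `index` to pick the matching value.
function store_solution!(df::DataFrame, optimal_values::AbstractVector)
    df.solution = optimal_values[df.index]
    return df
end

# Toy stand-in for a dataframe of flow variables; the values are made up.
flows = DataFrame(from = ["A", "B"], to = ["B", "C"], rp = [1, 1], index = [1, 2])
store_solution!(flows, [12.5, 7.0])

flows.solution  # one stored value per row, aligned with `index`
```

Keeping the values in a column next to the keys means that a later `filter` on `from`, `to`, or `rp` returns the matching values without a second lookup.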
@@ -268,21 +280,33 @@ To create a traditional array in the order given by the investable flows, one ca
 [solution.flows_investment[(u, v)] for (u, v) in edge_labels(graph) if graph[u, v].investable]
 ```

+The `solution.flow` and `solution.storage_level` values are linearized according to the dataframes in the dictionary `energy_problem.dataframes` with keys `:flows` and `:storage_level`, respectively.
+You need to query the data from these dataframes and then use the column `index` to select the appropriate value.
+
 To create a vector with all values of `flow` for a given `(u, v)` and `rp`, one can run

 ```@example solution
 (u, v) = first(edge_labels(graph))
 rp = 1
-[solution.flow[(u, v), rp, B] for B in graph[u, v].partitions[rp]]
+df = filter(
+    row -> row.rp == rp && row.from == u && row.to == v,
+    energy_problem.dataframes[:flows],
+    view = true,
+)
+[solution.flow[row.index] for row in eachrow(df)]
 ```

 To create a vector with all the values of `storage_level` for a given `a` and `rp`, one can run

 ```@example solution
-a = first(labels(graph))
+a = energy_problem.dataframes[:storage_level].asset[1]
 rp = 1
-cons_parts = energy_problem.constraints_partitions[:lowest_resolution]
-[solution.storage_level[a, rp, B] for B in cons_parts[(a, rp)]]
+df = filter(
+    row -> row.asset == a && row.rp == rp,
+    energy_problem.dataframes[:storage_level],
+    view = true,
+)
+[solution.storage_level[row.index] for row in eachrow(df)]
 ```

 > **Note**
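
The linearized lookup introduced in the hunk above can be tried on toy data. The sketch below assumes only DataFrames.jl. The table `flows` stands in for `energy_problem.dataframes[:flows]` and the vector `flow_values` stands in for `solution.flow`; both, including the column values, are made up for illustration.

```julia
using DataFrames

# Toy stand-in for the flows dataframe: one row per (from, to, rp, time_block).
flows = DataFrame(
    from = ["A", "A", "A", "B"],
    to = ["B", "B", "B", "C"],
    rp = [1, 1, 2, 1],
    time_block = [1:3, 4:6, 1:6, 1:6],
)
flows.index = 1:nrow(flows)  # linearized index: one position per row

# Toy stand-in for the linearized solution vector.
flow_values = [10.0, 20.0, 30.0, 40.0]

# Select the rows for one edge and one representative period, as in the diff,
# then use the `index` column to read the matching entries.
u, v, rp = "A", "B", 1
df = filter(row -> row.rp == rp && row.from == u && row.to == v, flows; view = true)
selected = [flow_values[row.index] for row in eachrow(df)]
```

With `view = true` the filtered result is a view into `flows` rather than a copy, which is enough here because the rows are only read.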
Expand All @@ -305,39 +329,106 @@ They can be accessed like any other value from [GraphAssetData](@ref) or [GraphF
```@example solution
(u, v) = first(edge_labels(graph))
rp = 1
[energy_problem.graph[u, v].flow[(rp, B)] for B in graph[u, v].partitions[rp]]
df = filter(
row -> row.rp == rp && row.from == u && row.to == v,
energy_problem.dataframes[:flows],
view = true,
)
[energy_problem.graph[u, v].flow[(rp, row.time_block)] for row in eachrow(df)]
```

To create a vector with the all values of `storage_level` for a given `a` and `rp`, one can run

```@example solution
a = first(labels(graph))
a = energy_problem.dataframes[:storage_level].asset[1]
rp = 1
df = filter(
row -> row.asset == a && row.rp == rp,
energy_problem.dataframes[:storage_level],
view = true,
)
[energy_problem.graph[a].storage_level[(rp, row.time_block)] for row in eachrow(df)]
```

### The solution inside the dataframes object

In addition to being stored in the `solution` object, and in the `graph` object, the solution for the flow and the storage_level is also stored inside the corresponding DataFrame objects if `solve_model!` is called.

The code below will do the same as in the two previous examples:

```@example solution
(u, v) = first(edge_labels(graph))
rp = 1
df = filter(
row -> row.rp == rp && row.from == u && row.to == v,
energy_problem.dataframes[:flows],
view = true,
)
df.solution
```

```@example solution
a = energy_problem.dataframes[:storage_level].asset[1]
rp = 1
cons_parts = energy_problem.constraints_partitions[:lowest_resolution]
[energy_problem.graph[a].storage_level[(rp, B)] for B in cons_parts[(a, rp)]]
df = filter(
row -> row.asset == a && row.rp == rp,
energy_problem.dataframes[:storage_level],
view = true,
)
df.solution
```

### Values of constraints and expressions

By accessing the model directly, we can query the values of constraints and expressions.
For instance, we can get all incoming flow in the lowest resolution for a given asset at a given time block for a given representative periods with the following:
We need to know the name of the constraint and how it is indexed, and for that you will need to check the model.

For instance, we can get all incoming flow in the lowest resolution for a given asset for a given representative periods with the following:

```@example solution
using JuMP
# a, rp, and cons_parts are defined above
B = cons_parts[(a, rp)][1]
value(energy_problem.model[:incoming_flow_lowest_resolution][a, rp, B])
# a and rp are defined above
df = filter(
row -> row.asset == a && row.rp == rp,
energy_problem.dataframes[:cons_lowest],
view = true,
)
[value(energy_problem.model[:incoming_flow_lowest_resolution][row.index]) for row in eachrow(df)];
```

The values of constraints can also be obtained, however they are frequently indexed in a subset, which means that their indexing is not straightforward.
To know how they are indexed, it is necessary to look at the code of the model.
For instance, to get the consumer balance, we first need to filter the `:cons_lowest` dataframes by consumers:

```@example solution
df_consumers = filter(
row -> graph[row.asset].type == "consumer",
energy_problem.dataframes[:cons_lowest],
view = false,
);
nothing # hide
```

We set `view = false` to create a copy of this DataFrame, so we can create our indexes:

```@example solution
df_consumers.index = 1:size(df_consumers, 1) # overwrites existing index
```

The same can happen for constraints.
For instance, the code below gets the consumer balance:
Now we can filter this DataFrame.

```@example solution
a = "Asgard_E_demand"
B = cons_parts[(a, rp)][1]
value(energy_problem.model[:consumer_balance][a, rp, B])
df = filter(
row -> row.asset == a && row.rp == rp,
df_consumers,
view = true,
)
value.(energy_problem.model[:consumer_balance][df.index]);
```

Here `value.` (i.e., broadcasting) was used instead of the vector comprehension from previous examples just to show that it also works.

The value of the constraint is obtained by looking only at the part with variables. So a constraint like `2x + 3y - 1 <= 4` would return the value of `2x + 3y`.

### Writing the output to CSV
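
The remark in the hunk above, that the value of a constraint covers only the part with variables, can be checked on a tiny standalone model. This sketch is independent of the package: the model, the variable bounds, and the objective are made up, and HiGHS is only an assumption (any LP solver supported by JuMP should behave the same).

```julia
using JuMP
import HiGHS  # assumption: any LP solver supported by JuMP would do

model = Model(HiGHS.Optimizer)
set_silent(model)

@variable(model, 0 <= x <= 1)
@variable(model, 0 <= y <= 1)
# Same shape as the constraint in the tutorial text: 2x + 3y - 1 <= 4.
# JuMP moves the constant to the right-hand side, i.e. stores 2x + 3y <= 5.
@constraint(model, c, 2x + 3y - 1 <= 4)
@objective(model, Max, x + y)
optimize!(model)

lhs = value(c)                        # value of the variable part, 2x + 3y
manual = 2 * value(x) + 3 * value(y)  # the same quantity computed by hand
```

Here `lhs` and `manual` agree up to solver tolerance, and the constant `-1` does not appear in either.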