# 5.3 Set operations

In [1]:
# install dependencies
%pip install -q amplpy

from amplpy import AMPL, ampl_notebook

ampl = ampl_notebook(
    modules=['highs'],  # modules to install
    license_uuid='default',  # license to use
)  # instantiate AMPL object and register magics




AMPL has four operators that construct new sets from existing ones:
```
A   union B     union: in either A or B
A   inter B     intersection: in both A and B
A   diff B      difference: in A but not B
A   symdiff B   symmetric difference: in A or B but not both
```
The following excerpt from an AMPL session shows how these work:

In [2]:
ampl.eval("""
    set Y1 = 2010 .. 2040 by 5;
    set Y2 = 2020 .. 2045 by 5;
""")
ampl.display('Y1 union Y2, Y1 inter Y2')
ampl.display('Y1 diff Y2, Y1 symdiff Y2')

set Y1 union Y2 := 2010 2015 2020 2025 2030 2035 2040 2045;

set Y1 inter Y2 := 2020 2025 2030 2035 2040;

set Y1 diff Y2 := 2010 2015;

set Y1 symdiff Y2 := 2010 2015 2045;
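
For readers who think in Python, the four operators correspond closely to the operators on Python's built-in `set` type. The sketch below rebuilds `Y1` and `Y2` with `range` and reproduces the four results displayed above. It is an analogy in plain Python, not an amplpy feature, and the lowercase names `y1` and `y2` are illustrative.

```python
# Plain-Python analogue of the AMPL session above (no AMPL required).
# range() excludes its stop value, so add 1 to include the final year.
y1 = set(range(2010, 2040 + 1, 5))  # 2010 .. 2040 by 5
y2 = set(range(2020, 2045 + 1, 5))  # 2020 .. 2045 by 5

union = y1 | y2    # Y1 union Y2
inter = y1 & y2    # Y1 inter Y2
diff = y1 - y2     # Y1 diff Y2
symdiff = y1 ^ y2  # Y1 symdiff Y2

# Python sets are unordered, so sort before printing.
for name, result in [('union', union), ('inter', inter),
                     ('diff', diff), ('symdiff', symdiff)]:
    print(name, sorted(result))
```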



The operands of set operators may be other set expressions, allowing more complex
expressions to be built up:

In [3]:
ampl.display('Y1 symdiff (Y1 symdiff Y2)')
ampl.display('Y1 union {2045,2055,2065} diff Y2')
ampl.display('2020..2060 by 5 symdiff (Y1 union Y2)')

set Y1 symdiff (Y1 symdiff Y2) := 2020 2025 2030 2035 2040 2045;

set Y1 union  {2045, 2055, 2065} diff Y2 := 2010 2015 2055 2065;

set 2020 .. 2060 by 5 symdiff (Y1 union Y2) := 2050 2055 2060 2010 2015;



The operands must always represent sets, however, so that for example you must write
`Y1 union {2025}`, not `Y1 union 2025`.
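
Python's set operators impose a similar rule, which may help make the requirement concrete. In this plain-Python sketch (an analogy, not AMPL itself), combining a set with a bare number raises a `TypeError`, while wrapping the number in a one-element set works.

```python
y1 = set(range(2010, 2040 + 1, 5))  # 2010 .. 2040 by 5

# A one-element set is a valid operand, like `Y1 union {2025}` in AMPL.
ok = y1 | {2025}

# A bare number is not a set, like the invalid `Y1 union 2025` in AMPL.
try:
    y1 | 2025
    bare_number_rejected = False
except TypeError:
    bare_number_rejected = True

print(ok == y1)              # True: 2025 is already a member of y1
print(bare_number_rejected)  # True
```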

Set operators group to the left unless parentheses are used to indicate otherwise. The
`union`, `diff`, and `symdiff` operators have the same precedence, just below that of
`inter`. Thus, for example,
```
A union B inter C diff D
```
is parsed as
```
(A union (B inter C)) diff D
```
A precedence hierarchy of all AMPL operators is given in Table A-1 of Section A.4.
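
When checking AMPL set expressions against Python sets, note that Python uses a different precedence order: `-` binds tighter than `&`, which binds tighter than `^`, which binds tighter than `|`. The plain-Python sketch below, using small made-up sets, writes out the AMPL grouping with explicit parentheses and shows that the unparenthesized Python expression gives a different result.

```python
# Small illustrative sets (made up for this example).
A = {1, 2}
B = {2, 3, 4}
C = {3, 4, 5}
D = {1, 4}

# AMPL parses "A union B inter C diff D" as (A union (B inter C)) diff D.
ampl_grouping = (A | (B & C)) - D

# Without parentheses, Python parses A | B & C - D as A | (B & (C - D)).
python_default = A | B & C - D

print(sorted(ampl_grouping))   # [2, 3]
print(sorted(python_default))  # [1, 2, 3]
```

Translating an AMPL expression with full parentheses avoids depending on either language's precedence rules.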

Set operations are often used in the assignment phrase of a set declaration, to define a
new set in terms of already declared sets. A simple example is provided by a variation on
the diet model of [Figure 2-1](../02/2_2_an_AMPL_model_for_the_diet_problem.ipynb#fig-2-1). Rather than specifying a lower limit and an upper limit on
the amount of every nutrient, suppose that you want to specify a set of nutrients that have
a lower limit, and a set of nutrients that have an upper limit. (Every nutrient is in one set
or the other; some nutrients might be in both.) You could declare:
```
set MINREQ;  # nutrients with minimum requirements
set MAXREQ;  # nutrients with maximum requirements
set NUTR;    # all nutrients (DUBIOUS)
```
But then you would be relying on the user of the model to make sure that `NUTR` contains
exactly all the members of `MINREQ` and `MAXREQ`. At best this is unnecessary work, and
at worst it will be done incorrectly. Instead you can define `NUTR` as the union:
```
set NUTR = MINREQ union MAXREQ;
```

All three of these sets are needed, since the nutrient minima and maxima are indexed over
`MINREQ` and `MAXREQ`,
```
param n_min {MINREQ} >= 0;
param n_max {MAXREQ} >= 0;
```
while the amounts of nutrients in the foods are indexed over `NUTR`:
```
param amt {NUTR,FOOD} >= 0;
```
The modification of the rest of the model is straightforward; the result is shown in [Figure 5-1](../05/5_3_set_operations.ipynb#fig-5-1).

As a general principle, it is a bad idea to set up a model so that redundant information
has to be provided. Instead a minimal necessary collection of sets should be chosen to be
supplied in the data, while other relevant sets are defined by expressions in the model.
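
The risk of redundant data can be illustrated outside AMPL as well. In this plain-Python sketch, a hand-maintained nutrient list (the hypothetical `nutr_by_hand`, which omits `NA`) has drifted out of step with the two requirement sets, whereas the derived set cannot. The set members are the nutrient names used in the data later in this section.

```python
MINREQ = {'A', 'B1', 'B2', 'C', 'CAL'}  # nutrients with minimum requirements
MAXREQ = {'A', 'NA', 'CAL'}             # nutrients with maximum requirements

# Hypothetical hand-maintained set that has fallen out of date.
nutr_by_hand = {'A', 'B1', 'B2', 'C', 'CAL'}

# Derived set, mirroring: set NUTR = MINREQ union MAXREQ;
nutr_derived = MINREQ | MAXREQ

# Members that the hand-maintained set forgot.
missing = nutr_derived - nutr_by_hand
print(sorted(missing))          # ['NA']

# Nutrients that have both a lower and an upper limit.
print(sorted(MINREQ & MAXREQ))  # ['A', 'CAL']
```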

<a id='fig-5-1'><center><b>Figure 5-1:</b> Diet model using union operator (dietu.mod).</center></a>

In [4]:
%%writefile dietu.mod

set MINREQ;                     # nutrients with minimum requirements
set MAXREQ;                     # nutrients with maximum requirements
set NUTR = MINREQ union MAXREQ; # nutrients
set FOOD;                       # foods
param cost {FOOD} > 0;
param f_min {FOOD} >= 0;
param f_max {j in FOOD} >= f_min[j];
param n_min {MINREQ} >= 0;
param n_max {MAXREQ} >= 0;
param amt {NUTR,FOOD} >= 0;
var Buy {j in FOOD} >= f_min[j], <= f_max[j];
minimize Total_Cost: sum {j in FOOD} cost[j] * Buy[j];
subject to Diet_Min {i in MINREQ}:
 sum {j in FOOD} amt[i,j] * Buy[j] >= n_min[i];
subject to Diet_Max {i in MAXREQ}:
 sum {j in FOOD} amt[i,j] * Buy[j] <= n_max[i];

Overwriting dietu.mod


In the data, `n_min` is left empty for nutrients that appear only in `MAXREQ`, and `n_max` is left empty for nutrients that appear only in `MINREQ`, since those values are not needed. There are several ways to convey this kind of sparse data in Python. One option is to use pandas DataFrames with empty values and to drop the empty values when the data are transferred to AMPL:
```
n_max_df = df_nutr['n_max'].dropna()
ampl.param['n_max'] = n_max_df
```

Another option for specifying sparse data is a key-value dictionary, where the key is the nutrient and the value is the corresponding `n_min` or `n_max` value. For example:
```
n_max_dict = {'A' : 20000, 'NA' : 50000, 'CAL' : 24000}
ampl.param['n_max'] = n_max_dict
```
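
Such dictionaries can also be built programmatically. The plain-Python sketch below assumes the raw data arrive as rows with `None` marking a missing limit, and filters out the missing entries with dictionary comprehensions. The variable `rows` is illustrative; its values match the nutrient data of this section.

```python
# Each row: (nutrient, lower limit or None, upper limit or None)
rows = [
    ('A', 700, 20000),
    ('C', 700, None),
    ('B1', 0, None),
    ('B2', 0, None),
    ('NA', None, 50000),
    ('CAL', 16000, 24000),
]

# Test against None explicitly: 0 is a valid limit and must be kept.
n_min_dict = {nutr: lo for nutr, lo, hi in rows if lo is not None}
n_max_dict = {nutr: hi for nutr, lo, hi in rows if hi is not None}

# The dictionary keys give the members of the two requirement sets.
minreq = list(n_min_dict)
maxreq = list(n_max_dict)

print(n_max_dict)  # {'A': 20000, 'NA': 50000, 'CAL': 24000}
print(minreq)      # ['A', 'C', 'B1', 'B2', 'CAL']
```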

<a id='fig-5-2'><center><b>Figure 5-2:</b> Data for diet model with unions.</center></a>

In [5]:
import pandas as pd

MINREQ = ['A', 'B1', 'B2', 'C', 'CAL']
MAXREQ = ['A', 'NA', 'CAL']

df_food = pd.DataFrame(
    [
        ['BEEF', 3.19, 2, 10],
        ['CHK', 2.59, 2, 10],
        ['FISH', 2.29, 2, 10],
        ['HAM', 2.89, 2, 10],
        ['MCH', 1.89, 2, 10],
        ['MTL', 1.99, 2, 10],
        ['SPG', 1.99, 2, 10],
        ['TUR', 2.49, 2, 10]
    ],
    columns=['FOOD', 'cost', 'f_min', 'f_max']
).set_index('FOOD')

df_nutr = pd.DataFrame(
    [
        ['A', 700, 20000],
        ['C', 700, None],
        ['B1', 0, None],
        ['B2', 0, None],
        ['NA', None, 50000],
        ['CAL', 16000, 24000],
    ],
    columns=['NUTR', 'n_min', 'n_max']
).set_index('NUTR')

df_amt = pd.DataFrame(
    [
        ['BEEF', 60, 20, 10, 15, 938, 295],
        ['CHK', 8, 0, 20, 20, 2180, 770],
        ['FISH', 8, 10, 15, 10, 945, 440],
        ['HAM', 40, 40, 35, 10, 278, 430],
        ['MCH', 15, 35, 15, 15, 1182, 315],
        ['MTL', 70, 30, 15, 15, 896, 400],
        ['SPG', 25, 50, 25, 15, 1329, 370],
        ['TUR', 60, 20, 15, 10, 1397, 450]
    ],
    columns=['FOOD', 'A', 'C', 'B1', 'B2', 'NA', 'CAL']
).set_index('FOOD').T

display(df_food)
display(df_nutr)
display(df_amt)

| FOOD | cost | f_min | f_max |
|------|------|-------|-------|
| BEEF | 3.19 | 2 | 10 |
| CHK | 2.59 | 2 | 10 |
| FISH | 2.29 | 2 | 10 |
| HAM | 2.89 | 2 | 10 |
| MCH | 1.89 | 2 | 10 |
| MTL | 1.99 | 2 | 10 |
| SPG | 1.99 | 2 | 10 |
| TUR | 2.49 | 2 | 10 |


| NUTR | n_min | n_max |
|------|-------|-------|
| A | 700.0 | 20000.0 |
| C | 700.0 | NaN |
| B1 | 0.0 | NaN |
| B2 | 0.0 | NaN |
| NA | NaN | 50000.0 |
| CAL | 16000.0 | 24000.0 |


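| FOOD | BEEF | CHK | FISH | HAM | MCH | MTL | SPG | TUR |
|------|------|-----|------|-----|-----|-----|-----|-----|
| A | 60 | 8 | 8 | 40 | 15 | 70 | 25 | 60 |
| C | 20 | 0 | 10 | 40 | 35 | 30 | 50 | 20 |
| B1 | 10 | 20 | 15 | 35 | 15 | 15 | 25 | 15 |
| B2 | 15 | 20 | 10 | 10 | 15 | 15 | 15 | 10 |
| NA | 938 | 2180 | 945 | 278 | 1182 | 896 | 1329 | 1397 |
| CAL | 295 | 770 | 440 | 430 | 315 | 400 | 370 | 450 |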


In [6]:
# Clear previous data
ampl.reset()

# Load dietu model
ampl.read('dietu.mod')

ampl.set['MINREQ'] = MINREQ
ampl.set['MAXREQ'] = MAXREQ

# Send data
ampl.param['n_min'].set_values(df_nutr['n_min'].dropna())
ampl.param['n_max'].set_values(df_nutr['n_max'].dropna())
ampl.set_data(df_food, 'FOOD')
ampl.param['amt'] = df_amt

# Solve problem
ampl.solve(solver='highs')
ampl.display('Buy')

HiGHS 1.11.0: optimal solution; objective 74.27382022
2 simplex iterations
0 barrier iterations
Buy [*] :=
BEEF   2
 CHK  10
FISH   2
 HAM   2
 MCH   2
 MTL   6.23596
 SPG   5.25843
 TUR   2
;
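
As a quick sanity check, the reported objective can be recomputed from the displayed `Buy` values and the costs in `df_food`. This plain-Python sketch copies the numbers from the output above. Because `display` rounds to six significant digits, the result should agree with the solver's value only to about four decimal places.

```python
# Costs from df_food and purchase amounts copied from the Buy display.
cost = {'BEEF': 3.19, 'CHK': 2.59, 'FISH': 2.29, 'HAM': 2.89,
        'MCH': 1.89, 'MTL': 1.99, 'SPG': 1.99, 'TUR': 2.49}
buy = {'BEEF': 2, 'CHK': 10, 'FISH': 2, 'HAM': 2,
       'MCH': 2, 'MTL': 6.23596, 'SPG': 5.25843, 'TUR': 2}

# Total_Cost = sum {j in FOOD} cost[j] * Buy[j]
total_cost = sum(cost[j] * buy[j] for j in cost)

reported = 74.27382022  # objective reported by HiGHS above
print(round(total_cost, 4))
print(abs(total_cost - reported) < 1e-3)
```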

