# Introduction to `sns_modeling` tool

The separation network synthesis (sns) tool provides an initial design for distillation separation systems that use thermally coupled distillation columns. Its intent is to determine the cost-optimal separation network structure for process synthesis and intensification applications. This notebook provides an overview of how to install the package and use it to build and solve separation network synthesis problems.

The main goals of the tool are to aid a designer in determining:

1. Selection of component splits in a separation network
2. Process topology, with an emphasis on heat exchangers
3. Initial sizing and costing of unit operations

## Table of Contents:

1. [Package Setup and Installation](#1-package-setup-and-installation)
2. [Package Structure](#2-package-structure)
3. [Problem Setup and Data Loading](#3-problem-setup-and-data-loading)
4. [Problem Scaling and Transformation](#4-problem-scaling-and-transformation)
5. [Problem Solution](#5-problem-solution)
6. [Solution Output and Interpretation](#6-solution-output-and-interpretation)

## 1. Package Setup and Installation

Download or clone the package from the GitHub repo [sns_modeling](https://github.com/pfauk/sns_modeling). In a terminal, navigate to the package directory and install the dependencies by running:

```
pip install -r requirements.txt
```

**Important**: this package requires an installed version of Gurobi to solve the MIQCP model. Gurobi can be installed with pip from the Python Package Index (PyPI). However, the free version comes with a trial license that can only solve small models (2,000 variables or constraints).

After installing dependencies, navigate to the directory of the package and run:

```
pip install -e . 
```

## 2. Package Structure

The package has several main components that contribute to the overall functionality of the design tool. The main components are:

1. Model generation
2. Superstructure generation
3. Data directory
4. Utilities
5. Documentation

**1. Model generation**

The Pyomo Concrete Model object is generated by calling the `build_model` function in `src\thermal_coupled\therm_dist.py`. All of the mathematical modeling components are defined and documented in this script. This function is the core functionality of the package.

**2. Superstructure generation**

An important aspect of process synthesis modeling is defining a process superstructure. This package includes functionality to automatically generate a network superstructure and pass it to the build model function. The `src\superstructure` directory contains the functionality to build the state task network (stn) for the superstructure of the problem.

**3. Data directory**

Data for this problem should be structured in the format of the example Excel spreadsheets provided in the `src\data` directory. Users do not have to place data files in that directory, but it is the default location that the utility functions pull data from.

**4. Utilities**

The `utils.py` script contains functionality for loading data from Excel spreadsheets into objects that can be passed to the `build_model` function. It also contains utility functions for printing and saving models and the resulting solutions.

**5. Documentation**

The `docs` directory contains files that give a detailed explanation of the mathematical model, empirical correlations that were used, and superstructure definition.

## 3. Problem Setup and Data Loading

Here we will show how to use the modeling tool's core functionality to build and solve a process synthesis problem. The general problem that we want to solve is: 

*Given an N-component zeotropic mixture, determine the cost-optimal separation sequence, column design, and heat integration to separate the mixture into N components*.

This can be generally visualized as separating some mixture of components {A, B, C, D} into streams of relatively high purity.


<img src="images/problem_statement.png" alt="Problem Statement" width="550" height="250" />

*Representation of the conceptual problem of separating a 4 component mixture into 4 high purity product streams*

The workflow for this design problem can be outlined as:

1. Define and construct a process superstructure
2. Define all relevant species and system data
3. Build the generalized disjunctive program (GDP)
4. Transform the GDP into a mixed integer quadratically constrained program (MIQCP)
5. Solve the mathematical program

### 3 a. Process Superstructure

We first need to represent the overall superstructure of a distillation process. The superstructure represents the solution space to the problem, so the representation has to be sufficiently detailed to provide a meaningful solution representation. We include functionality to automatically build superstructures for an arbitrary number of components in a mixture. There are two different types of distillation superstructures we can build, those with separation splits between **consecutive key components** and those with splits between **non-consecutive key components**.

For a single distillation column, we specify the light key and heavy key components that we want to recover in high purity in the distillate and bottoms, respectively. This is referred to as the split of the components in a mixture. There are two options for how to specify the key components. To illustrate them, consider a mixture of arbitrary components ordered by decreasing relative volatility: *ABC*.

For the case of splits between consecutive key components, there are two possible split options for this mixture: *A/BC* and *AB/C*. In the example of the split *A/BC*, *A* is the light key component and *B* is the heavy key component.

For the case of splits between non-consecutive key components, there are three possible split options for this mixture: *A/BC*, *AB/BC*, and *AB/C*. In the example of the split *AB/BC*, *A* is the light key component, *C* is the heavy key component, and *B* is an intermediate component that is allowed to distribute between the distillate and the bottoms of a distillation column.
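
The two split rules can be illustrated with a short, self-contained sketch. The helper names `consecutive_splits` and `nonconsecutive_splits` are hypothetical and are not part of the package. The enumeration rule used for non-consecutive keys (the distillate and bottoms may share a contiguous block of intermediate components) is an assumption that reproduces the three splits listed for *ABC*; it may differ from what the package generates for larger mixtures.

```python
def consecutive_splits(mixture):
    """Sharp splits between adjacent key components, e.g. 'A/BC'."""
    return [f"{mixture[:i]}/{mixture[i:]}" for i in range(1, len(mixture))]


def nonconsecutive_splits(mixture):
    """Splits where components between the two keys may distribute.

    i is the position of the first component in the bottoms and j is the
    number of components in the distillate; i == j gives a sharp split.
    Returns tuples of (split, light key, heavy key).
    """
    n = len(mixture)
    splits = []
    for i in range(1, n):
        for j in range(i, n):
            light_key, heavy_key = mixture[i - 1], mixture[j]
            splits.append((f"{mixture[:j]}/{mixture[i:]}", light_key, heavy_key))
    return splits


print(consecutive_splits("ABC"))
for split, light_key, heavy_key in nonconsecutive_splits("ABC"):
    print(f"{split}: light key {light_key}, heavy key {heavy_key}")
```

Under this assumed rule a 4 component mixture *ABCD* has 3 consecutive splits and 6 non-consecutive splits, which is one way to see why the non-consecutive superstructure grows faster.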

We will now show the first step of the modeling approach, which is to build the problem structure as a state-task network (STN).

In [2]:
## Imports
import logging
import os
import pyomo.environ as pyo
from pyomo.util.infeasible import log_infeasible_constraints, find_infeasible_constraints
from pyomo.util.model_size import build_model_size_report
from idaes.core.util.model_statistics import report_statistics
from utils import (
    Data,
    get_model_type,
    pprint_network,
    pprint_tasks,
    save_model_to_file,
    save_solution_to_file,
    print_constraint_type)
from superstructure.stn import stn
from superstructure.stn_nonconsecutive import stn_nonconsecutive
from thermal_coupled.therm_dist import build_model

In [4]:
# specify number of components in the feed to the system
n = 4

We choose the example of a 4 component system feed and a superstructure that uses splits only between consecutive key components.

In [5]:
# build state-task network superstructure and associated index sets
network_superstructure = stn(n)
network_superstructure.generate_tree()
network_superstructure.generate_index_sets()

This state-task network superstructure can be visually represented as a graph:

<img src="images/consecutive_split_stn.png" alt="4 Component STN" width="925" height="550">

*The state-task network superstructure for a 4 component mixture with splits between consecutive key components*

In [7]:
# the instantiated network_superstructure graph also provides a text display
network_superstructure.print_tree()

State(ABCD)
  Task(A/BCD)
    State(A, final=True)
    State(BCD)
      Task(B/CD)
        State(B, final=True)
        State(CD)
          Task(C/D)
            State(C, final=True)
            State(D, final=True)
      Task(BC/D)
        State(BC)
          Task(B/C)
            State(B, final=True)
            State(C, final=True)
        State(D, final=True)
  Task(AB/CD)
    State(AB)
      Task(A/B)
        State(A, final=True)
        State(B, final=True)
    State(CD)
      Task(C/D)
        State(C, final=True)
        State(D, final=True)
  Task(ABC/D)
    State(ABC)
      Task(A/BC)
        State(A, final=True)
        State(BC)
          Task(B/C)
            State(B, final=True)
            State(C, final=True)
      Task(AB/C)
        State(AB)
          Task(A/B)
            State(A, final=True)
            State(B, final=True)
        State(C, final=True)
    State(D, final=True)
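
The printed tree follows a simple recursion: every state with more than one component is split at each position between adjacent components, and each split spawns two child states. The sketch below reproduces the layout of the output above in plain Python. `stn_lines` is a hypothetical helper written for illustration and is not the package's `stn` implementation.

```python
def stn_lines(mixture, indent=0):
    """Return the text lines of the consecutive-split tree for `mixture`."""
    pad = "  " * indent
    if len(mixture) == 1:
        # a single component is a final product state
        return [f"{pad}State({mixture}, final=True)"]
    lines = [f"{pad}State({mixture})"]
    for i in range(1, len(mixture)):
        top, bottom = mixture[:i], mixture[i:]
        lines.append(f"{pad}  Task({top}/{bottom})")
        lines += stn_lines(top, indent + 2)
        lines += stn_lines(bottom, indent + 2)
    return lines


tree = stn_lines("ABCD")
unique_tasks = {ln.strip() for ln in tree if ln.strip().startswith("Task(")}
unique_states = {ln.strip().replace(", final=True", "")
                 for ln in tree if ln.strip().startswith("State(")}

print("\n".join(tree[:4]))
print(len(tree), "lines,", len(unique_tasks), "distinct tasks,",
      len(unique_states), "distinct states")
```

For four components this gives 40 printed lines but only 10 distinct tasks and 10 distinct states, because subtrees such as `State(CD)` appear under more than one parent.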


Alternatively, the superstructure can use splits between non-consecutive key components. An example of such a network for a feed of 4 components is shown below. Note that this network structure has more nodes and more connections, which results in a more difficult optimization problem.

<img src="images/nonconsecutive_split_stn.png" alt="4 Component STN" width="950" height="500">

*The state-task network superstructure for a 4 component mixture with splits between non-consecutive key components*

### 3 b. Data

After building a superstructure, a user needs to provide parameters to define a specific problem instance. Data needs to be provided for both the overall system and the species present in the feed mixture. The `Data` class from the tool utilities parses data from a spreadsheet and constructs an object to pass to the `build_model` function. Example data spreadsheet files are included, and users should use the same formatting for new problem instances. The `src/data` directory is the default location where the `Data` class looks for files, but alternative file locations can be provided as a keyword argument.

In [10]:
data_file_name = '4_comp_alkanes.xlsx'

# import problem data for system and relevant species to data object
mixture_data = Data(data_file_name)

**Species parameters**

This sheet contains parameters for each chemical species in the system. Species indices should be upper-case letters: A, B, C, D. Species should be ordered by decreasing relative volatility, with A as the most volatile and the last species as the least volatile.

Inlet fractions are mole fractions. Ensure they sum to 1.
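
These input conventions are easy to check before building a model. The sketch below is a hypothetical stand-alone helper (`check_species_inputs` is not part of the package utilities) that works on plain Python lists; the example values are the ones in the species table of the 4 component alkane example. Requiring strictly decreasing volatilities is an assumption made here.

```python
import math


def check_species_inputs(indices, mole_fracs, rel_vols):
    """Return a list of problems found in the species inputs (empty if none)."""
    problems = []
    # indices are expected to be consecutive upper-case letters starting at A
    expected = [chr(ord("A") + k) for k in range(len(indices))]
    if list(indices) != expected:
        problems.append(f"indices should be {expected}, got {list(indices)}")
    # inlet mole fractions must sum to 1 (within floating point tolerance)
    if not math.isclose(sum(mole_fracs), 1.0, abs_tol=1e-9):
        problems.append(f"mole fractions sum to {sum(mole_fracs):.6f}, not 1")
    # species must be ordered from most to least volatile
    if any(a <= b for a, b in zip(rel_vols, rel_vols[1:])):
        problems.append("relative volatilities are not strictly decreasing")
    return problems


print(check_species_inputs("ABCD",
                           [0.30, 0.30, 0.36, 0.04],
                           [3.56, 2.29, 1.50, 1.00]))
```

An empty list means the inputs follow the conventions described above.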

Relative volatilities can be determined from ASPEN properties for binary mixtures relative to the least volatile species in the system. The relative volatility ($\alpha_i$) should be found in ASPEN at the temperature and pressure specified in the system sheet. The system is modeled so that the feeds are liquids at their bubble point. The liquid density of each species at the system temperature and pressure is used in empirical correlations for equipment sizing.

The recovery ($x_i$) for each component is the fraction of that species' molar flow in the system inlet that leaves in the product stream for that species. Note that a product recovery constraint is different from a product purity constraint. Setting the recoveries too high (for example, setting all recoveries to 1) may lead to an infeasible problem, as such a separation could require columns with an infinite number of stages.
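
The difference between recovery and purity can be seen with a small numerical sketch. It assumes recovery is measured per species (flow of a species in its product stream divided by the flow of that species in the feed). The functions `recovery` and `purity` and the product stream composition are hypothetical and only for illustration; the feed flow and mole fractions come from the example data.

```python
def recovery(product_flows, feed_flows, species):
    """Fraction of a species' feed flow that ends up in this product stream."""
    return product_flows[species] / feed_flows[species]


def purity(product_flows, species):
    """Mole fraction of a species within this product stream."""
    return product_flows[species] / sum(product_flows.values())


F0 = 250.0  # total feed [kmol/hr], from the system data
z = {"A": 0.30, "B": 0.30, "C": 0.36, "D": 0.04}  # inlet mole fractions
feed = {s: F0 * z[s] for s in z}  # component feed flows [kmol/hr]

# hypothetical product stream for species A (illustrative numbers only)
product_A = {"A": 71.25, "B": 1.50, "C": 0.0, "D": 0.0}

print(f"recovery of A: {recovery(product_A, feed, 'A'):.3f}")
print(f"purity of A:   {purity(product_A, 'A'):.3f}")
```

The recovery depends only on how much A reaches the stream, while the purity also depends on how much of the other species is carried along.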


![](./images/species_params.png)

In [13]:
# print out the system data 
print('System data')
print('================================================================')
print(mixture_data.system_df)

System data
   F0 [kmol/hr]  Pressure [bar]  Temp [C]  Cost cooling [$/kJ]  \
0           250               1        85               0.0015   

   Cost heating [$/kJ]  
0                0.005  


In [14]:
# print out the feed species data 
print('Mixture species data')
print('================================================================')
print(mixture_data.species_df)

Mixture species data
          Species index  Inlet Mole Frac  Relative Volatility  \
0   C6 (n-hexane)     A             0.30                 3.56   
1  C7 (n-heptane)     B             0.30                 2.29   
2    C8 (n-octane     C             0.36                 1.50   
3   C9 (n-nonane)     D             0.04                 1.00   

   Liquid Density [kg/m^3]  Molecular Weight  \
0                      731               142   
1                      741               156   
2                      750               170   
3                      754               184   

   Enthalpy of Vaporization [kJ/mol]  Recovery  
0                              46.58      0.95  
1                              51.57      0.95  
2                              55.97      0.95  
3                              60.56      0.95  



## 4. Problem Scaling and Transformation

We now build the actual mathematical model as a Pyomo Concrete Model using the `build_model` function. This function returns both a Pyomo model and a dictionary of scaling factors. The keyword argument `scale` lets users apply pre-defined scaling factors that scale down the cost coefficients in the model. Down-scaling the model helps to reduce the size of the feasible space for the problem and speeds up the solution. For small problem instances, scaling provides less computational benefit, but as the number of components to separate grows, scaling becomes increasingly important for obtaining a solution in a reasonable time.

In [1]:
# building Pyomo model
model, scaling_factors = build_model(network_superstructure, mixture_data, scale=True)

It can sometimes be useful to inspect the model object before solving. The model constructed by the `build_model` function is a generalized disjunctive program (GDP). The `get_model_type` function from `utils.py` reports what type of mathematical model the Pyomo model object contains. Furthermore, the `save_model_to_file` function can be used to write the entire model object to a text file in a pretty-printed format. Saving the model for inspection is best done before transforming the GDP model. By default, the pretty-printed Pyomo model is saved to `thermal_coupled\saved_models`.

In [6]:
# saving the pyomo model to a file
save_model_to_file(model, '4_comp_model')

In [7]:
# check the model type prior to transformation
print(f'Model type before transformation: {get_model_type(model)}')

Finally, we need to transform the generalized disjunctive program into a mixed-integer program before a solver can be applied to the model. This is done using Pyomo.GDP's `TransformationFactory` as shown below. The GDP model also contains logical constraints that need to be transformed before the model is passed to a solver.

In [None]:
# convert the logical constraints to linear constraints first
pyo.TransformationFactory('core.logical_to_linear').apply_to(model)

# use Pyomo.GDP to apply the Big-M transformation to the disjunctions
mbigm = pyo.TransformationFactory('gdp.bigm')
mbigm.apply_to(model)

print(f'Model type after transformation: {get_model_type(model)}')
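
As a reminder of what the `gdp.bigm` step does, the generic textbook form of the Big-M reformulation is sketched below. The symbols $g_k$, $M_k$, $y_k$ and the index set $K$ are notation chosen here for illustration and are not taken from the package's model documentation.

$$
\bigvee_{k \in K}
\begin{bmatrix} Y_k \\ g_k(x) \le 0 \end{bmatrix}
\quad \longrightarrow \quad
\begin{aligned}
g_k(x) &\le M_k \, (1 - y_k), && k \in K \\
\sum_{k \in K} y_k &= 1 \\
y_k &\in \{0, 1\}, && k \in K
\end{aligned}
$$

When $y_k = 1$ the constraint $g_k(x) \le 0$ is enforced; when $y_k = 0$ it is relaxed by $M_k$, which must be a valid upper bound on $g_k(x)$ over the feasible region. Tighter values of $M_k$ generally give a stronger relaxation.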

## 5. Problem Solution

After applying the Big-M transformation using Pyomo.GDP's `TransformationFactory`, the Pyomo model is a non-convex mixed-integer quadratically constrained program (MIQCP). We can use Gurobi to solve the model. Gurobi has a number of parameters that can be passed to the solver; a [full list of solver parameters](https://www.gurobi.com/documentation/current/refman/parameters.html) can be found on the Gurobi website. For now we recommend setting the `NumericFocus` and `NonConvex` parameters to 2. Experience with the problem has also shown that relaxing the `MIPGap` tolerance above Gurobi's default of $10^{-4}$ can reduce solution time without negatively impacting the quality of the resulting solution. For more complex problems, a large amount of time can be spent by the solver in

In [8]:
# Pyomo solver factory
solver = pyo.SolverFactory('gurobi')

# Gurobi solver options
solver.options = {'NumericFocus': 2,
                  'NonConvex': 2,
                  'MIPGap': 1e-3}
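
To make the effect of the `MIPGap` setting concrete, the sketch below computes the relative gap from the best bound and the incumbent objective, following the definition in the Gurobi parameter documentation, $|ObjBound - ObjVal| / |ObjVal|$. The function `relative_mip_gap` and the objective values are hypothetical and only illustrate the arithmetic.

```python
def relative_mip_gap(obj_bound, obj_val):
    """Relative gap between the best bound and the incumbent objective."""
    if obj_bound == obj_val:
        return 0.0
    if obj_val == 0:
        return float("inf")
    return abs(obj_bound - obj_val) / abs(obj_val)


# hypothetical incumbent cost and best lower bound during a solve
gap = relative_mip_gap(obj_bound=999.2, obj_val=1000.0)

print(f"relative gap: {gap:.1e}")
print("stops with MIPGap = 1e-3:", gap <= 1e-3)
print("stops with MIPGap = 1e-4:", gap <= 1e-4)
```

With these numbers the solver would terminate under the looser tolerance but would keep working on the bound under the default one.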

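Now send the Pyomo model to the solver. Setting `tee=True` streams the solver log to the notebook. The `logging` module and the `log_infeasible_constraints` utility imported above can help to troubleshoot any infeasible constraints that might exist in the model. It is not uncommon to see some infeasibility log statements caused by variables introduced by the transformation.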

In [None]:
results = solver.solve(model, tee=True)

To see the output of the solution, use the `pprint_network` function from the package utilities:

In [10]:
pprint_network(model)

You can save the pretty-printed solution output to a text file with `save_solution_to_file`. The `thermal_coupled\results` directory is the default save location.

In [11]:
save_solution_to_file(model, '4_comp_solution_1')

