<a href="https://colab.research.google.com/github/EvenSol/NeqSim-Colab/blob/master/notebooks/process/Machine_learning_and_process_simulation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Machine learning and process simulation
Machine learning and process simulation are two powerful tools that can be used together to solve complex problems in various domains. Here's an overview of how they can be combined:

1. Data Generation: Machine learning models require a large amount of data to learn and make accurate predictions. Process simulation can be used to generate synthetic data that mimics the behavior of a real-world process. This data can then be used to train machine learning models.

2. Feature Extraction: Process simulation models often generate vast amounts of data, including various process variables and measurements. Machine learning techniques can be employed to extract relevant features from this data. These features can capture important patterns and relationships within the process, which can then be used as inputs for machine learning algorithms.

3. Model Training and Optimization: Machine learning models can be trained using the data generated by process simulations. This training can be used to develop predictive models that can estimate process variables or predict outcomes based on certain inputs. The machine learning models can be optimized using techniques like cross-validation, hyperparameter tuning, and model selection to improve their accuracy and performance.

4. Process Optimization and Control: Once trained, machine learning models can be integrated into process simulations to optimize and control the process. The models can be used to identify optimal process settings, predict failures or anomalies, and suggest corrective actions. By combining machine learning with process simulation, it becomes possible to optimize complex processes and improve efficiency.

5. Uncertainty Analysis: Process simulations often involve uncertainties due to factors such as input variability, model assumptions, and measurement errors. Statistical techniques such as Bayesian inference and Monte Carlo sampling, often paired with fast machine learning surrogates of the simulation, can be used to quantify and propagate these uncertainties through the model. This gives a fuller picture of the process behavior and supports better-informed decisions.

6. Model Validation: Machine learning models trained using process simulation data should be validated against real-world data to ensure their accuracy and generalizability. This involves comparing the model's predictions with actual measurements from the process. Any discrepancies can be used to refine the model or adjust simulation parameters.
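
As an illustration of item 5, the following is a minimal sketch of Monte Carlo uncertainty propagation. The function `toy_water_content_ppm`, its coefficients, and the assumed input distributions are hypothetical stand-ins chosen for this example; they are not a NeqSim model and are not fitted to any data in this notebook.

```python
import random
import statistics


def toy_water_content_ppm(lean_teg_flow_kg_hr, absorber_temp_c):
    """Hypothetical stand-in for a process simulation (not a NeqSim model)."""
    return 2.0 + 0.15 * absorber_temp_c - 0.0004 * (lean_teg_flow_kg_hr - 6000.0)


def propagate_uncertainty(n_samples=5000, seed=0):
    """Sample uncertain inputs, run the model on each sample, summarize the output."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        # Assumed input uncertainties: normal distributions around nominal values
        flow = rng.gauss(6100.0, 100.0)  # kg/hr
        temp = rng.gauss(35.0, 1.0)      # C
        outputs.append(toy_water_content_ppm(flow, temp))
    return statistics.mean(outputs), statistics.stdev(outputs)


mean_ppm, std_ppm = propagate_uncertainty()
print(f"water content: {mean_ppm:.2f} +/- {std_ppm:.2f} ppm")
```

In practice the toy function would be replaced by a trained surrogate model, since calling the full simulation thousands of times is usually too slow.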

# Use of NeqSim in combination with ML
In this notebook we use NeqSim to generate synthetic data for machine learning algorithms, and scikit-learn (https://scikit-learn.org/stable/index.html) for machine learning in Python.

In [1]:
!pip install neqsim
!pip install wget

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting neqsim
  Downloading neqsim-2.4.15-py3-none-any.whl (27.9 MB)
Collecting jpype1 (from neqsim)
  Downloading JPype1-1.4.1-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (465 kB)
Installing collected packages: jpype1, neqsim
Successfully installed jpype1-1.4.1 neqsim-2.4.15
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting wget
  Downloading wget-3.2.zip (10 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: wget
  Building wheel for wget (setup.py) ... done
  Created wheel for wget: filename=wget-3.2-py3-non

In [2]:
import pandas as pd
import wget

# Setting up a NeqSim model for a TEG dehydration process


In [3]:
url = 'https://raw.githubusercontent.com/EvenSol/NeqSim-Colab/master/data/TEGprocessData.csv'
filename = wget.download(url)

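# Read the file that was just downloaded; fail loudly if it cannot be loaded
try:
  TEGprocessDataFrame = pd.read_csv(filename)
except Exception as err:
  raise RuntimeError('error loading data from ' + filename) from err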

TEGprocessDataFrame.head(10)

Unnamed: 0,feedGasFlowRate,feedGasTemperature,feedGasPressure,absorberFeedGasTemperature,absorberFeedGasPressure,leanTEGFlowRate,leanTEGTemperature,flashDrumPressure,reboilerPressure,condenserPressure,condenserTemperature,reboilerTemperature,strippingGasRate,strippingGasFeedTemperature,bufferTankTemperatureTEG,hotTEGpumpPressure,regenerationGasCoolerTemperature
0,4,25,70,35,139,6100,48.5,4.8,1.2,1.2,100,200,180,80,90.5,3.0,50
1,5,25,70,35,139,6100,48.5,4.8,1.2,1.2,100,200,180,80,90.5,3.0,50
2,5,30,70,35,115,5000,40.0,4.8,1.2,1.15,100,200,180,80,91.0,3.0,50
3,6,30,70,35,135,7500,40.0,4.8,1.2,1.15,100,200,180,80,90.5,3.0,50
4,10,25,70,35,135,8000,40.0,4.8,1.2,1.15,100,200,180,80,90.5,3.5,50
5,5,25,70,35,135,6100,45.0,4.8,1.1,1.1,110,200,180,80,90.5,3.0,50
6,5,20,50,35,120,6500,40.0,4.0,1.2,1.15,110,200,180,70,90.5,3.0,50
7,5,20,50,35,120,6500,40.0,4.0,1.2,1.15,110,200,150,80,90.5,3.0,50
8,5,20,50,35,120,6500,40.0,4.0,1.2,1.15,110,170,150,80,90.5,3.1,50
9,7,30,55,35,120,7000,45.0,4.0,1.2,1.15,110,203,170,90,90.5,3.0,50


In [4]:
from neqsim.thermo import fluid, printFrame
from neqsim.process import (getProcess, clearProcess, mixer, heater, stream, pump,
                            separator, runProcess, saturator, valve, filters,
                            heatExchanger, simpleTEGAbsorber, distillationColumn,
                            waterStripperColumn, recycle2, setpoint, calculator)

def getTEGProcess(inputData):
  clearProcess()

  # Start by creating a fluid in neqsim
  feedGas = fluid("cpa")  # create a fluid using the CPA equation of state
  feedGas.addComponent("nitrogen", 0.245)
  feedGas.addComponent("CO2", 3.4)
  feedGas.addComponent("methane", 85.7)
  feedGas.addComponent("ethane", 5.981)
  feedGas.addComponent("propane", 2.743)
  feedGas.addComponent("i-butane", 0.37)
  feedGas.addComponent("n-butane", 0.77)
  feedGas.addComponent("i-pentane", 0.142)
  feedGas.addComponent("n-pentane", 0.166)
  feedGas.addComponent("n-hexane", 0.06)
  feedGas.addComponent("benzene", 0.01)
  feedGas.addComponent("water", 0.0)
  feedGas.addComponent("TEG", 0)
  feedGas.setMixingRule(10)
  feedGas.setMultiPhaseCheck(False)
  feedGas.init(0)

  dryFeedGas = stream(feedGas)
  dryFeedGas.setName('dry feed gas')
  dryFeedGas.setFlowRate(inputData['feedGasFlowRate'], 'MSm3/day')
  dryFeedGas.setTemperature(inputData['feedGasTemperature'], 'C')
  dryFeedGas.setPressure(inputData['feedGasPressure'], 'bara')

  saturatedFeedGas = saturator(dryFeedGas)
  saturatedFeedGas.setName("water saturator")

  waterSaturatedFeedGas = stream(saturatedFeedGas.getOutStream())
  waterSaturatedFeedGas.setName("water saturated feed gas")

  feedTEG = feedGas.clone()
  feedTEG.setMolarComposition([0.0,0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,0.015, 0.985])

  feedTPsetterToAbsorber = heater(waterSaturatedFeedGas)
  feedTPsetterToAbsorber.setName('TP of gas to absorber')
  feedTPsetterToAbsorber.setOutPressure(inputData['absorberFeedGasPressure'], "bara")
  feedTPsetterToAbsorber.setOutTemperature(inputData['absorberFeedGasTemperature'], "C")

  feedToAbsorber = stream(feedTPsetterToAbsorber.getOutStream())
  feedToAbsorber.setName("feed to TEG absorber")

  TEGFeed = stream(feedTEG)
  TEGFeed.setName('lean TEG to absorber')
  TEGFeed.setFlowRate(inputData['leanTEGFlowRate'], 'kg/hr')
  TEGFeed.setTemperature(inputData['leanTEGTemperature'], 'C')
  TEGFeed.setPressure(inputData['absorberFeedGasPressure'], 'bara')

  absorber = simpleTEGAbsorber()
  absorber.setName("TEG absorber")
  absorber.addGasInStream(feedToAbsorber)
  absorber.addSolventInStream(TEGFeed)
  absorber.setNumberOfStages(4)
  absorber.setStageEfficiency(0.7)

  dehydratedGas = stream(absorber.getGasOutStream())
  dehydratedGas.setName('dry gas from absorber')

  richTEG = stream(absorber.getSolventOutStream())
  richTEG.setName("rich TEG from absorber")

  glycol_flash_valve = valve(richTEG)
  glycol_flash_valve.setName("Rich TEG HP flash valve")
  glycol_flash_valve.setOutletPressure(inputData['flashDrumPressure'])

  richGLycolHeaterCondenser = heater(glycol_flash_valve.getOutStream())
  richGLycolHeaterCondenser.setName("rich TEG preheater")

  heatEx2 = heatExchanger(richGLycolHeaterCondenser.getOutStream())
  heatEx2.setName("cold lean/rich TEG heat-exchanger")
  heatEx2.setGuessOutTemperature(273.15 + 60.0)
  heatEx2.setUAvalue(2224.0)

  flashSep = separator(heatEx2.getOutStream(0))
  flashSep.setName("degasing separator")

  flashGas = stream(flashSep.getGasOutStream())
  flashGas.setName("gas from degasing separator")

  flashLiquid = stream(flashSep.getLiquidOutStream())
  flashLiquid.setName("liquid from degasing separator")

  fineFilter = filters(flashLiquid)
  fineFilter.setName("TEG fine filter")
  fineFilter.setDeltaP(0.001, "bara")

  heatEx = heatExchanger(fineFilter.getOutStream())
  heatEx.setName("lean/rich TEG heat-exchanger")
  heatEx.setGuessOutTemperature(273.15 + 130.0)
  heatEx.setUAvalue(8316.0)

  glycol_flash_valve2 = valve(heatEx.getOutStream(0))
  glycol_flash_valve2.setName("Rich TEG LP flash valve")
  glycol_flash_valve2.setOutletPressure(inputData['reboilerPressure'])

  stripGas = feedGas.clone()

  strippingGas = stream(stripGas)
  strippingGas.setName('stripGas')
  strippingGas.setFlowRate(inputData['strippingGasRate'], "Sm3/hr")
  strippingGas.setTemperature(inputData['strippingGasFeedTemperature'], "C")
  strippingGas.setPressure(inputData['reboilerPressure'], "bara")

  gasToReboiler = strippingGas.clone()
  gasToReboiler.setName("gas to reboiler")

  column = distillationColumn(1, True, True)
  column.setName("TEG regeneration column")
  column.addFeedStream(glycol_flash_valve2.getOutStream(), 1)
  column.getReboiler().setOutTemperature(273.15 + inputData['reboilerTemperature'])
  column.getCondenser().setOutTemperature(273.15 + inputData['condenserTemperature'])
  column.getTray(1).addStream(gasToReboiler)
  column.setTopPressure(inputData['condenserPressure'])
  column.setBottomPressure(inputData['reboilerPressure'])

  coolerRegenGas = heater(column.getGasOutStream())
  coolerRegenGas.setName("regen gas cooler")
  coolerRegenGas.setOutTemperature(273.15 + inputData['regenerationGasCoolerTemperature'])

  sepregenGas = separator(coolerRegenGas.getOutStream())
  sepregenGas.setName("regen gas separator")

  gasToFlare = stream(sepregenGas.getGasOutStream())
  gasToFlare.setName("gas to flare")

  liquidToTrreatment = stream(sepregenGas.getLiquidOutStream())
  liquidToTrreatment.setName("water to treatment")

  stripper = waterStripperColumn("TEG stripper")
  stripper.addSolventInStream(column.getLiquidOutStream())
  stripper.addGasInStream(strippingGas)
  stripper.setNumberOfStages(4)
  stripper.setStageEfficiency(0.68)

  recycleGasFromStripper = recycle2("stripping gas recirc")
  recycleGasFromStripper.addStream(stripper.getGasOutStream())
  recycleGasFromStripper.setOutletStream(gasToReboiler)

  heatEx.setFeedStream(1, stripper.getSolventOutStream())

  bufferTank = heater(heatEx.getOutStream(1))
  bufferTank.setName("TEG buffer tank")
  bufferTank.setOutTemperature(273.15 + inputData['bufferTankTemperatureTEG'])

  hotLeanTEGPump = pump(bufferTank.getOutStream(),inputData['hotTEGpumpPressure'],"lean TEG LP pump")

  heatEx2.setFeedStream(1, hotLeanTEGPump.getOutStream())

  coolerhOTteg3 = heater(heatEx2.getOutStream(1))
  coolerhOTteg3.setName("lean TEG cooler")
  coolerhOTteg3.setOutTemperature(273.15 + inputData['leanTEGTemperature'])

  hotLeanTEGPump2 = pump(coolerhOTteg3.getOutStream(), inputData['absorberFeedGasPressure'], "lean TEG HP pump")
  hotLeanTEGPump2.setName("lean TEG HP pump")
  hotLeanTEGPump2.setOutletPressure(inputData['absorberFeedGasPressure'])

  pumpHPPresSet = setpoint("HP pump set", hotLeanTEGPump2, "pressure", feedToAbsorber)

  leanTEGtoabs = stream(hotLeanTEGPump2.getOutStream())
  leanTEGtoabs.setName("lean TEG to absorber")

  pureTEG = feedGas.clone()
  pureTEG.setMolarComposition([0.0,0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0])

  makeupTEG = stream(pureTEG)
  makeupTEG.setName("makeup TEG")
  makeupTEG.setFlowRate(1e-6, "kg/hr")
  makeupTEG.setTemperature(inputData['leanTEGTemperature'], "C")
  makeupTEG.setPressure(inputData['absorberFeedGasPressure'], "bara")

  makeupCalculator = calculator("TEG makeup calculator")
  makeupCalculator.addInputVariable(dehydratedGas)
  makeupCalculator.addInputVariable(flashGas)
  makeupCalculator.addInputVariable(gasToFlare)
  makeupCalculator.addInputVariable(liquidToTrreatment)
  makeupCalculator.setOutputVariable(makeupTEG)

  makeupMixer = mixer("makeup mixer")
  makeupMixer.addStream(leanTEGtoabs)
  makeupMixer.addStream(makeupTEG)

  resycleLeanTEG = recycle2("lean TEG resycle")
  resycleLeanTEG.addStream(makeupMixer.getOutStream())
  resycleLeanTEG.setOutletStream(TEGFeed)
  resycleLeanTEG.setPriority(200)
  resycleLeanTEG.setDownstreamProperty("flow rate")

  richGLycolHeaterCondenser.setEnergyStream(column.getCondenser().getEnergyStream())

  TEGprocess = getProcess()
  return TEGprocess.copy()

# Start simulation

In [5]:
processes = []

for i in range(len(TEGprocessDataFrame)):
  inputData = TEGprocessDataFrame.loc[i]
  process = getTEGProcess(inputData)
  process.runAsThread()
  processes.append(process)

# Collect synthetic data from NeqSim
The simulations must finish before the results are collected and the machine learning models are trained. The calculation time can be significant (up to 1 hour).
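
One way to guarantee that every background job has finished before results are read is to keep a handle to each job and block on it. The sketch below shows that pattern with Python's standard `concurrent.futures` module. The function `run_case` is a hypothetical placeholder for building and running one simulation case; it does not call NeqSim, and NeqSim process objects may need different handling.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def run_case(case_id):
    """Hypothetical placeholder for building and running one simulation case."""
    time.sleep(0.01)  # pretend to do some work
    return {"case": case_id, "converged": True}


case_ids = list(range(5))

with ThreadPoolExecutor(max_workers=4) as pool:
    # Submit every case, keeping one future per case
    futures = [pool.submit(run_case, cid) for cid in case_ids]
    # result() blocks until that case is done; results stay in input order
    results = [f.result() for f in futures]

print(len(results), "cases finished")
```

Collecting results only after every `result()` call has returned avoids reading values from a simulation that is still running.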

In [56]:
drygaswaterppm = []
reboilerdutykW = []
gasToFLareRatekghr = []

for i in range(len(TEGprocessDataFrame)):
  drygaswaterppm.append(processes[i].getUnit("dry gas from absorber").getFluid().getPhase(0).getComponent('water').getz()*1.0e6)
  reboilerdutykW.append(processes[i].getUnit("TEG regeneration column").getReboiler().getDuty()/1.0e3)
  gasToFLareRatekghr.append(processes[i].getUnit("gas to flare").getFlowRate("kg/hr"))

# Use a name that does not shadow the built-in dict type
results = {
        'drygasppm': drygaswaterppm,
        'reboilerdutykW': reboilerdutykW,
        'gasToFLareRatekghr': gasToFLareRatekghr
        }

resultsDF = pd.DataFrame(results)
print(resultsDF)

     drygasppm  reboilerdutykW  gasToFLareRatekghr
0     3.808609      242.224036          196.628351
1     4.097538      254.899782          196.462821
2     4.618307      220.261069          194.033076
3     5.255173      373.270383          206.025929
4     6.474771      431.837932          208.848433
5     3.408914      257.074737          199.166452
6     4.604282      276.189160          200.678788
7     5.447505      275.467026          173.385493
8    18.900043      190.234948          173.070190
9     5.902153      392.200835          192.364801
10    5.989675      306.459227          195.352502
11    5.904907      357.370876          196.782634
12    5.244608      267.817104          169.020340
13    8.225064      458.211977          179.099317
14    6.257876      180.263615          155.975076
15    7.979425       97.562682          151.046124
16   22.695616      274.731309          153.845125
17    7.979425       67.841585          166.616833
18   22.910978       58.372706 

# Setting up the machine learning model
Split the data set into training and testing data, then fit a decision tree regressor that predicts all three outputs.

In [57]:
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Split the dataset into train (80 %) and test (20 %) sets
X_train, X_test, y_train, y_test = train_test_split(TEGprocessDataFrame.to_numpy(), resultsDF, test_size=0.2)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

# Fit a single regression tree on the training data
regr_1 = DecisionTreeRegressor(max_depth=6)
regr_1.fit(X_train, y_train)

# R-squared on the data the tree was fitted to
score = regr_1.score(X_train, y_train)
print("train R-squared:", score)

# R-squared on the held-out data
scoreTest = regr_1.score(X_test, y_test)
print("test R-squared:", scoreTest)

(34, 17) (9, 17) (34, 3) (9, 3)
train R-squared: 0.9789802117092932
test R-squared: 0.5946169580018167
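
The gap between the train and test R-squared values suggests that the tree overfits, and with only 43 rows (34 for training, 9 for testing) a single split gives a noisy estimate. A common remedy is k-fold cross-validation combined with a small search over `max_depth`. The sketch below shows the idea on randomly generated stand-in data of the same shape. The arrays `X` and `y` are assumptions made for this example, not the TEG data set, so the printed scores say nothing about the model above.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Stand-in data with the same shape as the notebook's inputs (43 rows, 17 features)
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(43, 17))
y = 3.0 * X[:, 0] + np.sin(6.0 * X[:, 1]) + rng.normal(0.0, 0.05, size=43)

# Shuffled 5-fold split with a fixed seed so that the folds are reproducible
cv = KFold(n_splits=5, shuffle=True, random_state=0)

for depth in (2, 4, 6):
    model = DecisionTreeRegressor(max_depth=depth, random_state=0)
    scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
    print(f"max_depth={depth}: mean R2 = {scores.mean():.3f} (std {scores.std():.3f})")
```

To apply this to the notebook's data, `X` and `y` would be replaced by `TEGprocessDataFrame.to_numpy()` and one column of `resultsDF`.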


In [58]:
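# Predict the held-out rows and compare them with the simulated values
y_pred = regr_1.predict(X_test)
print('input \n', y_test)
print('trained model \n', y_pred)

# mean_squared_error expects (y_true, y_pred)
testerror = mean_squared_error(y_test, y_pred)
print("test error :", testerror)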

input 
     drygasppm  reboilerdutykW  gasToFLareRatekghr
31  12.577958      264.161471          157.699488
32   7.556837      192.268507          197.284201
41  45.811824      174.903271          170.058458
39  12.493555      286.730165          157.048547
4    6.474771      431.837932          208.848433
35   7.556837      189.095201          162.694480
24   5.142578      167.966124          156.240319
33   8.376569      127.440398          179.234819
6    4.604282      276.189160          200.678788
trained model 
 [[ 12.67059531 264.24739443 160.97991572]
 [  6.25787625 180.26361501 155.97507594]
 [ 39.12010909 173.82618181 171.5143957 ]
 [ 12.67059531 264.24739443 160.97991572]
 [  5.90490677 357.37087606 196.78263441]
 [ 12.67059531 264.24739443 160.97991572]
 [ 12.67059531 264.24739443 160.97991572]
 [  8.98062597 108.93992081 150.80345717]
 [  4.28967488 253.91582202 196.3731178 ]]
test error : 919.0804364732918
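
The single test error above is scikit-learn's default for multi-output targets: the mean squared error of each output column, averaged uniformly. Because the three outputs have different units and magnitudes, the average is dominated by the reboiler duty. The sketch below recomputes the error per output from the rounded values printed above, using only the standard library. The per-output errors come out at roughly 14, 2439 and 304, and their mean reproduces the reported 919.08.

```python
# Test-set targets and predictions copied from the printed output above
# (columns: drygasppm, reboilerdutykW, gasToFLareRatekghr)
names = ["drygasppm", "reboilerdutykW", "gasToFLareRatekghr"]
y_true = [
    [12.577958, 264.161471, 157.699488],
    [7.556837, 192.268507, 197.284201],
    [45.811824, 174.903271, 170.058458],
    [12.493555, 286.730165, 157.048547],
    [6.474771, 431.837932, 208.848433],
    [7.556837, 189.095201, 162.694480],
    [5.142578, 167.966124, 156.240319],
    [8.376569, 127.440398, 179.234819],
    [4.604282, 276.189160, 200.678788],
]
y_pred = [
    [12.67059531, 264.24739443, 160.97991572],
    [6.25787625, 180.26361501, 155.97507594],
    [39.12010909, 173.82618181, 171.5143957],
    [12.67059531, 264.24739443, 160.97991572],
    [5.90490677, 357.37087606, 196.78263441],
    [12.67059531, 264.24739443, 160.97991572],
    [12.67059531, 264.24739443, 160.97991572],
    [8.98062597, 108.93992081, 150.80345717],
    [4.28967488, 253.91582202, 196.3731178],
]


def column_mse(true_rows, pred_rows, col):
    """Mean squared error of one output column."""
    return sum((t[col] - p[col]) ** 2 for t, p in zip(true_rows, pred_rows)) / len(true_rows)


per_output = {name: column_mse(y_true, y_pred, i) for i, name in enumerate(names)}
overall = sum(per_output.values()) / len(per_output)

for name, mse in per_output.items():
    print(f"{name}: MSE = {mse:.2f}, RMSE = {mse ** 0.5:.2f}")
print(f"mean over outputs: {overall:.2f}")
```

Reporting the error per output, or scaling the targets before fitting, makes it easier to see which quantity the model predicts poorly.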


# Use the model for predicting the process data
In the following code we test the trained model for a given input.

In [59]:
inputData1 = [{
  "feedGasFlowRate": 6.0, #MSm3/day
  "feedGasTemperature": 25.0, #C
  "feedGasPressure":70.0, #bara
  "absorberFeedGasTemperature": 35.0, #C
  "absorberFeedGasPressure": 139.0, #bara
  "leanTEGFlowRate": 6100.0, #kg/hr
  "leanTEGTemperature": 48.5, #C
  "flashDrumPressure": 4.8, #bara
  'reboilerPressure': 1.2, #bara
  'condenserPressure':  1.2, #bara
  'condenserTemperature': 100.0, #C
  'reboilerTemperature': 200.0, #C
  'strippingGasRate': 180.0, #Sm3/hr
  "strippingGasFeedTemperature": 80.0, #C
  'bufferTankTemperatureTEG': 90.5, #C
  'hotTEGpumpPressure': 3.0, #bara
  "regenerationGasCoolerTemperature": 50.0, #C
}]

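# Order the columns as in the training data and pass a plain array, matching how the model was fitted
X_point = pd.DataFrame(inputData1)[TEGprocessDataFrame.columns]
ypredNew = regr_1.predict(X_point.to_numpy())
print('trained model \n', ypredNew)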

trained model 
 [[  4.28967488 253.91582202 196.3731178 ]]


