1. Frontend User Input:
Coal Ratios - the user selects coals and their percentages (checked against the coal property data),
'Load',
'feed water temperature',
'Running plant load factor',
'Air to fuel ratio for mill'

2. Weighted Average:
Coal Ratios and coal property data (Coal Name, TM, IM, ASH, VM, FC, GCV, GCV (ARB))

3. Model Input:
'Load',
'feed water temperature',
'Running plant load factor',
'Air to fuel ratio for mill',
'TM_WA', 'IM_WA', 'ASH_WA',
'VM_WA', 'FC_WA',
'GCV_WA', 'GCV (ARB)_WA'

4. Model Output:
'Boiler Efficiency',
'Nox',
'UBC in BA',
'UBC in FA'

5. Frontend Output:
'Boiler Efficiency',
'Nox',
'UBC in BA',
'UBC in FA',
'GCV_WA',
'GCV (ARB)_WA'
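
The weighted-average step in item 2 can be illustrated with a minimal sketch. The coal names and property values below are made up for illustration and are not taken from the coal property data; only the formula (sum of property × percentage, divided by 100) follows the notebook. The helper name `weighted_average` is hypothetical.

```python
# Hypothetical property values for two coals (illustration only).
coal_properties = {
    "Coal A": {"TM": 10.0, "ASH": 30.0, "GCV": 4000.0},
    "Coal B": {"TM": 20.0, "ASH": 10.0, "GCV": 5000.0},
}

# Blend as selected on the frontend: coal name -> percentage.
blend = {"Coal A": 60, "Coal B": 40}


def weighted_average(blend, properties):
    """Return {prop + '_WA': sum(value * pct) / 100} for every property."""
    props = next(iter(properties.values())).keys()
    return {
        f"{prop}_WA": sum(properties[coal][prop] * pct for coal, pct in blend.items()) / 100
        for prop in props
    }


result = weighted_average(blend, coal_properties)
print(result)  # {'TM_WA': 14.0, 'ASH_WA': 22.0, 'GCV_WA': 4400.0}
```

For example, TM_WA is (10 × 60 + 20 × 40) / 100 = 14.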


In [None]:
import pandas as pd
import ast

# -----------------------------
# Load CSVs
# -----------------------------
coal_df = pd.read_csv("coal_property_data.csv")
blend_df = pd.read_csv("New_Blend_dict.csv")

# Clean column names
coal_df.columns = coal_df.columns.str.strip()
coal_df["Coal_Name"] = (
    coal_df["Coal_Name"]
    .astype(str)
    .str.strip()
)

# Set Coal_Name as index for fast lookup
coal_df.set_index("Coal_Name", inplace=True)

# -----------------------------
# Weighted average function
# -----------------------------
def calculate_weighted_average(blend_str):
    blend_dict = ast.literal_eval(blend_str)

    # Initialize weighted sums
    weighted_sum = {
        "TM": 0, "IM": 0, "ASH": 0, "VM": 0,
        "FC": 0, "GCV": 0, "GCV (ARB)": 0
    }

    for coal, pct in blend_dict.items():
        if coal not in coal_df.index:
            raise ValueError(f"Coal '{coal}' not found in coal properties CSV")

        for prop in weighted_sum.keys():
            weighted_sum[prop] += coal_df.loc[coal, prop] * pct

    # Divide by 100 to get weighted average
    weighted_avg = {k: v / 100 for k, v in weighted_sum.items()}

    return weighted_avg

# -----------------------------
# Apply to blend column
# -----------------------------
blend_df["Weighted_Average"] = blend_df["Blend"].apply(calculate_weighted_average)

print(blend_df["Weighted_Average"])
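# index=False keeps the row index out of the file (avoids an "Unnamed: 0" column on re-read)
blend_df.to_csv("Final_weighted.csv", index=False)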
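
Dividing by 100 gives a true weighted average only when the percentages in a blend add up to 100. Below is a minimal sketch of a check that could run before `calculate_weighted_average`, assuming blends are stored as dictionary strings as in the `Blend` column. The helper name `parse_blend` and the tolerance are hypothetical.

```python
import ast
import math


def parse_blend(blend_str, tolerance=1e-6):
    """Parse a blend string and check that its percentages total 100."""
    blend = ast.literal_eval(blend_str)
    if not isinstance(blend, dict):
        raise ValueError("Blend must be a dictionary of coal name -> percentage")
    total = sum(blend.values())
    if not math.isclose(total, 100, abs_tol=tolerance):
        raise ValueError(f"Blend percentages sum to {total}, expected 100")
    return blend


print(parse_blend("{'Coal A': 60, 'Coal B': 40}"))
```

A blend such as `{'Coal A': 60, 'Coal B': 30}` would raise a `ValueError` instead of silently producing an under-weighted average.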


In [None]:
import pandas as pd
import ast

# Read CSV
df = pd.read_csv("Final_weighted.csv")

# Convert string dict → actual dict
df["Weighted_Average"] = df["Weighted_Average"].apply(ast.literal_eval)

# Expand each dictionary into one column per property
expanded_df = pd.DataFrame(df["Weighted_Average"].tolist(), index=df.index)

# Save to CSV
expanded_df.to_csv("weighted_average.csv", index=False)

print(expanded_df)
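
The model inputs listed in the outline use a `_WA` suffix, whereas the expanded columns keep the raw property names. Here is a minimal sketch of renaming them and attaching the operating inputs to form one model-input row. All numeric values are placeholders, not plant data, and the variable names are hypothetical.

```python
import pandas as pd

# Placeholder weighted averages, shaped like one row of weighted_average.csv.
weighted = pd.DataFrame([{
    "TM": 14.0, "IM": 5.0, "ASH": 22.0, "VM": 28.0,
    "FC": 45.0, "GCV": 4400.0, "GCV (ARB)": 3900.0,
}])

# Placeholder operating inputs entered on the frontend.
operating = pd.DataFrame([{
    "Load": 500.0,
    "feed water temperature": 250.0,
    "Running plant load factor": 85.0,
    "Air to fuel ratio for mill": 2.0,
}])

# Append "_WA" to every property column, then join the two frames side by side.
model_input = pd.concat([operating, weighted.add_suffix("_WA")], axis=1)
print(model_input.columns.tolist())
```

The resulting column order matches the `input_features` list used for training.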


In [None]:
%pip install pycaret pandas scikit-learn


Model Training

In [None]:
import pandas as pd
from pycaret.regression import setup, compare_models, finalize_model, save_model

# ================================
# 1. Load Data
# ================================
df = pd.read_csv("Training_data.csv")
df.columns = df.columns.str.strip()
# ================================
# 2. Define Input & Target Columns
# ================================
input_features = [
    'Load',
    'feed water temperature',
    'Running plant load factor',
    'Air to fuel ratio for mill',
    'TM_WA', 'IM_WA', 'ASH_WA',
    'VM_WA', 'FC_WA',
    'GCV_WA', 'GCV (ARB)_WA'
]

target_variables = [
    'Boiler Efficiency',
    'Nox',
    'UBC in BA',
    'UBC in FA'
]

# ================================
# 3. Loop Over Each Target Variable
# ================================
for target in target_variables:
    
    print("\n==============================")
    print(f"Training model for: {target}")
    print("==============================\n")
    
    # Select relevant columns
    model_df = df[input_features + [target]].dropna()
    
    # ----------------------------
    # PyCaret Setup
    # ----------------------------
    setup(
        data=model_df,
        target=target,
        session_id=42,
        verbose=False
    )
    
    # ----------------------------
    # Compare Models
    # ----------------------------
    best_model = compare_models()
    
    # ----------------------------
    # Finalize Model
    # ----------------------------
    final_model = finalize_model(best_model)
    
    # ----------------------------
    # Save Model
    # ----------------------------
    model_name = target.replace(" ", "_").lower() + "_model"
    save_model(final_model, model_name)
    
    print(f"Model saved as: {model_name}.pkl")

print("\n✅ All models trained and saved successfully.")



Training model for: Boiler Efficiency



ID,Model,MAE,MSE,RMSE,R2,RMSLE,MAPE,TT (Sec)
gbr,Gradient Boosting Regressor,0.0224,0.0042,0.0467,0.9869,0.0005,0.0003,0.021
ada,AdaBoost Regressor,0.0522,0.0044,0.0655,0.9862,0.0007,0.0006,0.017
catboost,CatBoost Regressor,0.042,0.005,0.066,0.9851,0.0007,0.0005,0.403
dt,Decision Tree Regressor,0.0064,0.0057,0.0238,0.9837,0.0003,0.0001,0.009
rf,Random Forest Regressor,0.031,0.007,0.0669,0.9798,0.0008,0.0004,0.04
xgboost,Extreme Gradient Boosting,0.0195,0.009,0.0542,0.9716,0.0006,0.0002,0.02
et,Extra Trees Regressor,0.0371,0.0104,0.09,0.9689,0.001,0.0004,0.027
lightgbm,Light Gradient Boosting Machine,0.0704,0.012,0.1023,0.9646,0.0012,0.0008,0.039
lr,Linear Regression,0.1818,0.0809,0.2769,0.7399,0.0031,0.0021,0.91
ridge,Ridge Regression,0.1967,0.0904,0.2946,0.708,0.0033,0.0023,0.011


Transformation Pipeline and Model Successfully Saved
Model saved as: boiler_efficiency_model.pkl

Training model for: Nox



ID,Model,MAE,MSE,RMSE,R2,RMSLE,MAPE,TT (Sec)
dt,Decision Tree Regressor,0.0,0.0,0.0,1.0,0.0,0.0,0.018
gbr,Gradient Boosting Regressor,0.0009,0.0,0.001,1.0,0.0,0.0,0.024
rf,Random Forest Regressor,0.0346,0.0288,0.1001,1.0,0.0007,0.0002,0.041
xgboost,Extreme Gradient Boosting,0.0001,0.0,0.0001,1.0,0.0,0.0,0.02
ada,AdaBoost Regressor,0.0,0.0,0.0,1.0,0.0,0.0,0.011
et,Extra Trees Regressor,0.3586,1.1484,0.9135,0.9992,0.0052,0.0021,0.032
catboost,CatBoost Regressor,1.8745,9.1012,2.8431,0.993,0.0193,0.0124,0.356
lightgbm,Light Gradient Boosting Machine,4.1053,73.5169,7.5595,0.9397,0.0419,0.0242,0.039
ridge,Ridge Regression,8.4881,137.5396,11.3363,0.8898,0.0643,0.0497,0.013
br,Bayesian Ridge,8.3957,137.693,11.3148,0.8896,0.0639,0.0489,0.008


Transformation Pipeline and Model Successfully Saved
Model saved as: nox_model.pkl

Training model for: UBC in BA



ID,Model,MAE,MSE,RMSE,R2,RMSLE,MAPE,TT (Sec)
gbr,Gradient Boosting Regressor,1.0426,2.147,1.4443,0.6909,0.1548,0.1373,0.022
rf,Random Forest Regressor,1.0108,2.0888,1.4225,0.6841,0.1539,0.1364,0.044
et,Extra Trees Regressor,1.039,2.244,1.4672,0.6473,0.1549,0.1373,0.031
catboost,CatBoost Regressor,1.053,2.3689,1.5079,0.6318,0.1634,0.143,0.35
ada,AdaBoost Regressor,1.1242,2.3895,1.5249,0.6161,0.1656,0.1554,0.023
xgboost,Extreme Gradient Boosting,1.1026,2.4785,1.5687,0.6112,0.1684,0.1464,0.022
lightgbm,Light Gradient Boosting Machine,1.2343,2.932,1.6895,0.5082,0.184,0.1717,0.039
ridge,Ridge Regression,1.3716,3.4055,1.8203,0.5025,0.1973,0.1865,0.011
br,Bayesian Ridge,1.3676,3.559,1.8633,0.4586,0.1987,0.1819,0.015
lr,Linear Regression,1.4105,3.9102,1.9138,0.4577,0.2074,0.1914,0.011


Transformation Pipeline and Model Successfully Saved
Model saved as: ubc_in_ba_model.pkl

Training model for: UBC in FA



ID,Model,MAE,MSE,RMSE,R2,RMSLE,MAPE,TT (Sec)
rf,Random Forest Regressor,0.9103,1.7272,1.2989,0.7026,0.1698,0.1631,0.042
gbr,Gradient Boosting Regressor,0.9536,1.835,1.3329,0.7004,0.1734,0.167,0.03
xgboost,Extreme Gradient Boosting,0.9687,1.9006,1.3488,0.6783,0.1839,0.175,0.027
catboost,CatBoost Regressor,0.9948,1.9311,1.3688,0.6594,0.1854,0.1813,0.359
et,Extra Trees Regressor,0.9514,1.8707,1.3448,0.6409,0.172,0.1667,0.037
ada,AdaBoost Regressor,1.0605,2.0512,1.419,0.632,0.1953,0.2073,0.021
lightgbm,Light Gradient Boosting Machine,1.1459,2.6358,1.595,0.5159,0.2187,0.2206,0.038
ridge,Ridge Regression,1.2586,3.1161,1.7245,0.504,0.2322,0.2334,0.01
dt,Decision Tree Regressor,1.0916,2.7184,1.5842,0.4899,0.2072,0.1893,0.014
br,Bayesian Ridge,1.2427,3.2555,1.761,0.4657,0.2325,0.2249,0.01


Transformation Pipeline and Model Successfully Saved
Model saved as: ubc_in_fa_model.pkl

✅ All models trained and saved successfully.
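
With the four models saved, the Model Output step from the outline could be sketched as follows. This is a minimal sketch, not the notebook's own inference code. `load_model` and `predict_model` are passed in as arguments so the sketch runs without PyCaret installed; in practice they would presumably be the functions of the same name from `pycaret.regression`. The `prediction_label` column name is an assumption about PyCaret 3.x output and may differ in other versions. The helper names `model_name_for` and `predict_targets` are hypothetical.

```python
def model_name_for(target):
    """Rebuild the file stem used by the training loop for a target."""
    return target.replace(" ", "_").lower() + "_model"


def predict_targets(model_input, targets, load_model, predict_model,
                    prediction_column="prediction_label"):
    """Return {target: predicted value} for a single-row model input.

    load_model and predict_model are injected (assumed to behave like
    pycaret.regression.load_model / predict_model).
    """
    results = {}
    for target in targets:
        model = load_model(model_name_for(target))
        scored = predict_model(model, data=model_input)
        # Assumption: predictions are returned in `prediction_column`.
        results[target] = scored[prediction_column].iloc[0]
    return results


targets = ["Boiler Efficiency", "Nox", "UBC in BA", "UBC in FA"]
print([model_name_for(t) for t in targets])
# ['boiler_efficiency_model', 'nox_model', 'ubc_in_ba_model', 'ubc_in_fa_model']
```

The frontend output would then combine these four predictions with `GCV_WA` and `GCV (ARB)_WA` taken from the model input row.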
