## In-Context Learning (ICL) with contextual information

In this section, we apply In-Context Learning (ICL) with additional contextual information. By incorporating general knowledge about geopolymer concrete into the prompt, we aim to improve the model's predictions and outperform the traditional models. The context prompt (labeled `ICL_finetuned` in the code below) covers the following factors for geopolymer concrete with FA/GGBFS:

1. Higher GGBFS content (e.g., 0.3/0.7) typically yields greater strength.
2. Lower W/C ratios generally result in increased compressive strength.
3. Enhanced powder content (FA + GGBFS) contributes to higher strength.
4. Curing methods: Heat curing is often more effective for higher FA content (e.g., 0.5/0.5), while GGBFS-rich blends can achieve adequate strength with ambient curing.



The In-Context Learning process involves providing the model with concrete formulations as prompts and their compressive strengths as completions. The model then receives only prompts and must generate the completions itself.

The prompt format is as follows:

Please consider the following disclaimer: For geopolymer concrete with FA/GGBFS, consider the following: (1) Higher GGBFS content (e.g., 0.3/0.7) typically yields greater strength. (2) Lower W/C ratios generally result in increased compressive strength. (3) Enhanced powder content (FA + GGBFS) contributes to higher strength. (4) Curing methods: Heat curing is often more effective for higher FA content (e.g., 0.5/0.5), while GGBFS-rich blends can achieve adequate strength with ambient curing.; We will do an exercise where I will provide you with concrete formulations as prompts and their respective compressive strength as completions for you to learn from. Then you will only receive prompts and need to complete it yourself. Add the respective Idx to each answer. Let's go:

{training_text}
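
The way `{training_text}` might be filled can be sketched as follows. This is a minimal illustration, not the notebook's `gather_LLM_results` helper: the record layout, the `build_icl_prompt` function, the line format, and all example values are assumptions made up for this sketch, and it samples with a local `random.Random(seed)` rather than the `random.seed` plus `random.shuffle` calls used in the cells below.

```python
import random

# Made-up records for illustration only: (Idx, formulation text, strength in MPa).
# The real layout of data/transformed_data.txt is not shown, so this format is an assumption.
RECORDS = [
    (0, "FA/GGBFS 0.5/0.5, W/C 0.45, powder 400 kg/m3, heat cured", 41.0),
    (1, "FA/GGBFS 0.3/0.7, W/C 0.40, powder 450 kg/m3, ambient cured", 58.0),
    (2, "FA/GGBFS 0.5/0.5, W/C 0.50, powder 380 kg/m3, ambient cured", 35.0),
    (3, "FA/GGBFS 0.3/0.7, W/C 0.45, powder 420 kg/m3, heat cured", 52.0),
]


def build_icl_prompt(records, n_train, n_test, seed, context):
    """Return (prompt, train, test) for one seeded ICL run (hypothetical helper)."""
    rng = random.Random(seed)  # local generator, so the split depends only on the seed
    sampled = rng.sample(records, n_train + n_test)
    train, test = sampled[:n_train], sampled[n_train:]
    # Training examples carry their completions
    training_text = "\n".join(
        f"Idx: {idx}; prompt: {text}; completion: {strength:.2f}"
        for idx, text, strength in train
    )
    # Test examples leave the completion open for the model to fill in
    test_text = "\n".join(
        f"Idx: {idx}; prompt: {text}; completion:" for idx, text, _ in test
    )
    return f"{context}\n\n{training_text}\n\n{test_text}", train, test


prompt, train, test = build_icl_prompt(
    RECORDS, n_train=2, n_test=2, seed=0, context="<context text>"
)
print(prompt)
```

Keeping the test strengths out of the prompt is the point of the split: the model only ever sees completions for the training examples.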


All training and test sets used for the experiments will be stored in the `results` folder, allowing for easy access and reproducibility of the study.

In [None]:
# Predict alkali-activated concrete properties with in-context learning using OpenAI's gpt-4o-mini model

from openai import OpenAI
from openai import RateLimitError
import random
import numpy as np
import pandas as pd
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error
import matplotlib.pyplot as plt
import re
import os
import time
from utils import *


# Load OpenAI client
client = OpenAI()
model_name = "gpt-4o-mini"
approach = "ICL_finetuned"  # same label as the one passed to save_results_to_csv below

# Read data from file
data_path = os.path.join('data', 'transformed_data.txt')
with open(data_path, 'r') as f:
    data = f.readlines()    

system_message_path = os.path.join("data", "system_message_context.txt")
with open(system_message_path, "r") as f:
    system_txt = f.read().strip()


system_message = {"role": "system", "content": system_txt}

# Randomly sample n lines for training and N lines for testing
n = 10
N = 25

# Initialize empty lists to store results
result_list = []
indices = list(range(len(data)))
# Repeat the process 10 times
for i in range(10):
    random.seed(i)
    #np.random.seed(i)
    test_prompts, true_values, predictions = gather_LLM_results(data,
                                                                n,
                                                                N,
                                                                client,
                                                                model_name,
                                                                indices,
                                                                system_message)

    append_to_result_list(test_prompts, true_values, predictions, result_list)

save_results_to_csv(model_name, "ICL_finetuned", result_list)

response: 54.12
result: 54.12
response: 44.12
result: 44.12
response: 41.12
result: 41.12
response: 42.34
result: 42.34
response: 54.12
result: 54.12
response: 62.14
result: 62.14
response: 62.12
result: 62.12
response: 41.12
result: 41.12
response: 55.12
result: 55.12
response: 38.12
result: 38.12
response: 41.12
result: 41.12
response: 38.12
result: 38.12
response: 58.12
result: 58.12
response: 38.12
result: 38.12
response: 41.12
result: 41.12
response: 38.12
result: 38.12
response: 54.12
result: 54.12
response: 58.12
result: 58.12
response: 50.12
result: 50.12
response: 38.12
result: 38.12
response: 62.15
result: 62.15
response: 41.12
result: 41.12
response: 54.12
result: 54.12
response: 38.12
result: 38.12
response: 50.12
result: 50.12
R-squared: 0.45
MAE: 6.44
MSE: 74.92
response: 54.12
result: 54.12
response: 34.12
result: 34.12
response: 42.18
result: 42.18
response: 50.12
result: 50.12
response: 34.12
result: 34.12
response: 38.12
result: 38.12
response: 25.12
result: 25.12

### Vanilla In-Context Learning (ICL)

In this section, we benchmark In-Context Learning (ICL) using OpenAI's `gpt-4o-mini` model. ICL leverages large language models to incorporate context and general knowledge, providing flexibility in handling non-numeric inputs and overcoming the limitations of traditional vector space formulations. The experiments use the OpenAI API without any fine-tuning or additional contextual information.

All training and test sets used for the experiments will be stored in the `results` folder, allowing for easy access and reproducibility of the study.
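
Because the model answers in free text, each completion has to be converted to a number before it can be scored; the `response`/`result` pairs printed by the ICL cells suggest such a step. A minimal sketch of one way to do this is shown here. The `parse_strength` helper is hypothetical and is not the function in `utils`; it assumes the strength is the last number in the completion.

```python
import re
from typing import Optional

# Matches integers and decimals, with an optional sign
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def parse_strength(response: str) -> Optional[float]:
    """Return the last number found in a completion, or None if there is none.

    Assumption: when the answer also contains an Idx, the strength comes last.
    """
    matches = _NUMBER.findall(response)
    return float(matches[-1]) if matches else None


print(parse_strength("Idx 7: 54.12"))
print(parse_strength("no numeric answer"))
```

Returning `None` instead of raising lets the caller decide whether to retry the request or drop the sample.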


In [2]:
# Predict alkali-activated concrete properties with in-context learning using OpenAI's gpt-4o-mini model

from openai import OpenAI
import random
import numpy as np
import pandas as pd
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error
import matplotlib.pyplot as plt
import re
import os
from utils import *  # helper functions used below, as in the previous cell

# Create the OpenAI client (reads the API key from the OPENAI_API_KEY environment variable)
client = OpenAI()
model_name = "gpt-4o-mini"

data_path = os.path.join('data', 'transformed_data.txt')
with open(data_path, 'r') as f:
    data = f.readlines()

system_message_path = os.path.join("data", "system_message.txt")
with open(system_message_path, "r") as f:
    system_txt = f.read().strip()

system_message = {"role": "system", "content": system_txt}

# Randomly sample n lines for training and N lines for testing
n = 10
N = 25

# Initialize empty lists to store results
result_list = []
indices = list(range(len(data)))
# Repeat the process 10 times
for i in range(10):
    random.seed(i)
    #np.random.seed(i)
    test_prompts, true_values, predictions = gather_LLM_results(data,
                                                                n,
                                                                N,
                                                                client,
                                                                model_name,
                                                                indices,
                                                                system_message)

    append_to_result_list(test_prompts, true_values, predictions, result_list)

save_results_to_csv(model_name, "ICL", result_list)


response: 50.12
result: 50.12
response: 48.12
result: 48.12
response: 41.12
result: 41.12
response: 55.12
result: 55.12
response: 50.12
result: 50.12
response: 50.12
result: 50.12
response: 54.12
result: 54.12
response: 50.12
result: 50.12
response: 50.12
result: 50.12
response: 50.12
result: 50.12
response: 55.12
result: 55.12
response: 50.12
result: 50.12
response: 55.12
result: 55.12
response: 50.12
result: 50.12
response: 50.12
result: 50.12
response: 50.12
result: 50.12
response: 50.12
result: 50.12
response: 50.12
result: 50.12
response: 62.14
result: 62.14
response: 50.12
result: 50.12
response: 50.12
result: 50.12
response: 50.12
result: 50.12
response: 38.12
result: 38.12
response: 50.12
result: 50.12
response: 62.14
result: 62.14
R-squared: -0.63
MAE: 11.37
MSE: 223.38
response: 30.12
result: 30.12
response: 42.18
result: 42.18
response: 50.12
result: 50.12
response: 50.12
result: 50.12
response: 55.12
result: 55.12
response: 55.12
result: 55.12
response: 38.12
result: 38.12
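
A negative R-squared, as in the first iteration above, means the predictions have a larger squared error than simply predicting the mean of the true values. The sketch below computes the three reported metrics in plain Python to make this visible. The `regression_metrics` helper and the four true values are made up for illustration; it assumes the true values are not all identical, since otherwise R-squared is undefined.

```python
def regression_metrics(y_true, y_pred):
    """Return (r2, mae, mse) for two equal-length sequences of numbers."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mse = ss_res / n
    r2 = 1.0 - ss_res / ss_tot  # negative whenever ss_res > ss_tot
    return r2, mae, mse


# Made-up values: a constant prediction that misses the mean of the true values
y_true = [30.0, 40.0, 50.0, 60.0]
y_pred = [50.0, 50.0, 50.0, 50.0]
r2, mae, mse = regression_metrics(y_true, y_pred)
print(f"R-squared: {r2:.2f}, MAE: {mae:.2f}, MSE: {mse:.2f}")
```

A nearly constant prediction, like the many repeated completions in the output above, can therefore score below zero even when its absolute error looks moderate.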


### Gaussian Process Regression (GPR)

In this section, we benchmark Gaussian Process Regression (GPR) using the `scikit-learn` library. GPR is a non-parametric Bayesian approach to regression that provides uncertainty estimates for its predictions. It assumes that the function values at any finite set of input points follow a joint multivariate Gaussian distribution.
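
For reference, the standard GPR predictive equations under a zero prior mean can be written as below. The notation is an assumption introduced here, not taken from the notebook: $K = k(X, X)$, $K_* = k(X, X_*)$, $K_{**} = k(X_*, X_*)$ for training inputs $X$ and test inputs $X_*$, $\sigma_n^2$ is the noise (jitter) term, $\sigma_f^2$ the amplitude, and $\ell$ the length scale. The last line is the Matérn kernel with $\nu = 3/2$, the kernel family selected by `nu=1.5` in the code below.

```latex
\begin{aligned}
\mu_*    &= K_*^\top \left(K + \sigma_n^2 I\right)^{-1} \mathbf{y}, \\
\Sigma_* &= K_{**} - K_*^\top \left(K + \sigma_n^2 I\right)^{-1} K_*, \\
k_{\nu=3/2}(r) &= \sigma_f^2 \left(1 + \frac{\sqrt{3}\, r}{\ell}\right)
                  \exp\!\left(-\frac{\sqrt{3}\, r}{\ell}\right),
\qquad r = \lVert x - x' \rVert .
\end{aligned}
```

The diagonal of $\Sigma_*$ gives the per-sample predictive variance, which is the source of the uncertainty estimates mentioned above.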

All training and test sets used for the experiments will be stored in the `results` folder, allowing for easy access and reproducibility of the study.



In [3]:
import random
import os
import numpy as np
import pandas as pd
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, Matern
from sklearn.preprocessing import StandardScaler, MinMaxScaler
import re
import matplotlib.pyplot as plt


# Load the numeric dataset
file_path = os.path.join('data', 'numeric_data.csv')
data = pd.read_csv(file_path)

# Amplitude (constant) kernel times a Matern kernel with nu=1.5
kernel = ConstantKernel(1.0, (1e-3, 1e3)) * Matern(length_scale=10, nu=1.5)

gpr = GaussianProcessRegressor(kernel=kernel)

# DataFrames to store results
train_results_df = pd.DataFrame()
test_results_df = pd.DataFrame(columns=['Iteration', 'Idx_Sample', 'Input Features', 'True Values', 'Predicted Values'])

n = 10
N = 25

indices = list(range(len(data)))
for i in range(10):
    # Seed per iteration so the sequence of train/test splits is reproducible
    random.seed(i)
    random.shuffle(indices)
    # Take the first n shuffled rows for training and the next N for testing
    train_data = data.iloc[indices[:n]]
    test_data = data.iloc[indices[n:n+N]]

    target_column = 'fc_28dGroundTruth'
    idx_column = 'Idx_Sample'
    X_train = train_data.drop(columns=[target_column, idx_column], axis=1)

    # Normalize input features
    X_scaler = StandardScaler()
    X_train = X_scaler.fit_transform(X_train)

    # Scale the target variable for training
    y_scaler = MinMaxScaler()
    y_train = y_scaler.fit_transform(train_data[target_column].copy().to_numpy().reshape(-1, 1))

    gpr.fit(X_train, y_train)

    # Test data
    X_test = test_data.drop(columns=[target_column, idx_column], axis=1)
    X_test = X_scaler.transform(X_test)

    # Predict on test data
    predictions = gpr.predict(X_test)
    predictions = predictions.reshape(-1, 1)
    predictions = y_scaler.inverse_transform(predictions)

    # Store true and predicted values
    true_values = test_data[target_column].copy().to_numpy().reshape(-1, 1)
    idx_sample = test_data[idx_column].copy().to_numpy()

    # Store train data
    
    train_results_df = pd.concat([train_results_df, train_data], ignore_index=True)

    # Store test data
    iteration_df = pd.DataFrame({
        'Iteration': i+1,
        'Idx_Sample': idx_sample,
        'Input Features': list(X_test),
        'True Values': true_values.flatten(),
        'Predicted Values': predictions.flatten()
    })

    test_results_df = pd.concat([test_results_df, iteration_df], ignore_index=True)

    # Calculate R2 score, mean absolute error, and mean squared error
    r2 = r2_score(true_values, predictions)
    mae = mean_absolute_error(true_values, predictions)
    mse = mean_squared_error(true_values, predictions)

    # Evaluation
    print(f"Iteration: {i+1}")
    print(f"R-squared: {r2:.2f}")
    print(f"MAE: {mae:.2f}")
    print(f"MSE: {mse:.2f}")


# NOTE: model_name is not defined in this cell; it is carried over from the ICL cells above
train_results_file = os.path.join('results', model_name, 'GPR', 'train.csv')

# Make needed directories
dir_name = os.path.dirname(train_results_file)
os.makedirs(dir_name, exist_ok=True)

train_results_df.to_csv(train_results_file, index=False)

test_results_file = os.path.join('results', model_name, 'GPR', 'test.csv')
test_results_df.to_csv(test_results_file, index=False)



Iteration: 1
R-squared: 0.78
MAE: 4.35
MSE: 29.76
Iteration: 2
R-squared: 0.01
MAE: 8.30
MSE: 100.57
Iteration: 3
R-squared: 0.62
MAE: 4.34
MSE: 28.64
Iteration: 4
R-squared: 0.35
MAE: 6.41
MSE: 59.51
Iteration: 5
R-squared: 0.28
MAE: 7.41
MSE: 80.64
Iteration: 6
R-squared: 0.64
MAE: 4.89
MSE: 31.87
Iteration: 7
R-squared: 0.78
MAE: 4.45
MSE: 30.08
Iteration: 8
R-squared: 0.69
MAE: 4.66
MSE: 30.84
Iteration: 9
R-squared: 0.48
MAE: 6.17
MSE: 59.95
Iteration: 10
R-squared: 0.65
MAE: 5.70
MSE: 45.49
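
Since the per-iteration scores vary considerably, a mean and sample standard deviation across the ten splits is a convenient summary. The sketch below uses the R-squared and MAE values copied from the GPR output above; the `summarize` helper is a name introduced here for illustration.

```python
from statistics import mean, stdev

# R-squared and MAE values copied from the ten GPR iterations printed above
r2_scores = [0.78, 0.01, 0.62, 0.35, 0.28, 0.64, 0.78, 0.69, 0.48, 0.65]
mae_scores = [4.35, 8.30, 4.34, 6.41, 7.41, 4.89, 4.45, 4.66, 6.17, 5.70]


def summarize(values):
    """Return (mean, sample standard deviation) of a list of scores."""
    return mean(values), stdev(values)


r2_mean, r2_std = summarize(r2_scores)
mae_mean, mae_std = summarize(mae_scores)
print(f"R-squared: {r2_mean:.2f} +/- {r2_std:.2f}")
print(f"MAE: {mae_mean:.2f} +/- {mae_std:.2f}")
```

The same summary can be applied to the scores of the other models to compare them on equal terms.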


### Random Forest (RF)

In this section, we benchmark the Random Forest (RF) model. RF is an ensemble learning method that constructs multiple decision trees and combines their outputs for improved prediction accuracy and reduced overfitting. An earlier version of this notebook used an M5-Tree variant from the `lolopy` library, which places linear regression models in the tree leaves and provides calibrated uncertainty estimates. The current code no longer uses `lolopy` and relies on `RandomForestRegressor` from `scikit-learn` instead.

All training and test sets used for the experiments will be stored in the `results` folder, allowing for easy access and reproducibility of the study.
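
The idea of combining many trees can be illustrated with a deliberately simplified sketch: bootstrap aggregation (bagging) of one-split regression stumps on a single feature. This is not what `RandomForestRegressor` does internally, which grows full trees and subsamples features; all function names and data values here are made up for illustration.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression stump on 1-D data by minimizing squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so the right side is never empty
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((y - left_mean) ** 2 for y in left) + sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: predict the mean everywhere
        overall = sum(ys) / len(ys)
        return xs[0], overall, overall
    return best[1], best[2], best[3]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_bagged_stumps(xs, ys, n_estimators=25, seed=0):
    """Fit each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict_bagged(stumps, x):
    """Average the predictions of all stumps."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)


# Made-up data: strength decreasing with W/C ratio
wc_ratio = [0.30, 0.35, 0.40, 0.45, 0.50, 0.55]
strength = [60.0, 58.0, 55.0, 45.0, 42.0, 40.0]
stumps = fit_bagged_stumps(wc_ratio, strength)
print(predict_bagged(stumps, 0.30), predict_bagged(stumps, 0.55))
```

Averaging over bootstrap samples smooths out the sharp step of any single stump, which is the variance-reduction effect the ensemble relies on.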

In [4]:
import os
import random
import numpy as np
import pandas as pd
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler, MinMaxScaler
import re
import matplotlib.pyplot as plt


file_path = os.path.join('data', 'numeric_data.csv')
data = pd.read_csv(file_path)  
        

# DataFrames to store results
train_results_df = pd.DataFrame()
test_results_df = pd.DataFrame(columns=['Iteration', 'Idx_Sample', 'Input Features', 'True Values', 'Predicted Values'])

n = 10
N = 25

indices = list(range(len(data)))

for i in range(10):
    # Seed per iteration so the sequence of train/test splits is reproducible
    random.seed(i)
    random.shuffle(indices)
    # Take the first n shuffled rows for training and the next N for testing
    train_data = data.iloc[indices[:n]]
    test_data = data.iloc[indices[n:n+N]]

    target_column = 'fc_28dGroundTruth'
    idx_column = 'Idx_Sample'
    X_train = train_data.drop(columns=[target_column, idx_column], axis=1)

    # Normalize input features
    X_scaler = StandardScaler()
    X_train = X_scaler.fit_transform(X_train)

    # Scale the target variable for training
    y_scaler = MinMaxScaler()
    y_train = y_scaler.fit_transform(train_data[target_column].copy().to_numpy().reshape(-1, 1))

    # ravel() passes y as a 1-D array, which RandomForestRegressor expects
    rf = RandomForestRegressor()
    rf.fit(X_train, y_train.ravel())

    # Test data
    X_test = test_data.drop(columns=[target_column, idx_column], axis=1)
    X_test = X_scaler.transform(X_test)

    # Predict on test data
    predictions = rf.predict(X_test)
    predictions = predictions.reshape(-1, 1)
    predictions = y_scaler.inverse_transform(predictions)

    # Store true and predicted values
    true_values = test_data[target_column].copy().to_numpy().reshape(-1, 1)
    idx_sample = test_data[idx_column].copy().to_numpy()

    # Store train data
    train_results_df = pd.concat([train_results_df, train_data], ignore_index=True)

    # Store test data
    iteration_df = pd.DataFrame({
        'Iteration': i+1,
        'Idx_Sample': idx_sample,
        'Input Features': list(X_test),
        'True Values': true_values.flatten(),
        'Predicted Values': predictions.flatten()
    })

    test_results_df = pd.concat([test_results_df, iteration_df], ignore_index=True)

    # Calculate R2 score, mean absolute error, and mean squared error
    r2 = r2_score(true_values, predictions)
    mae = mean_absolute_error(true_values, predictions)
    mse = mean_squared_error(true_values, predictions)

    # Evaluation
    print(f"Iteration: {i+1}")
    print(f"R-squared: {r2:.2f}")
    print(f"MAE: {mae:.2f}")
    print(f"MSE: {mse:.2f}")


# NOTE: model_name is not defined in this cell; it is carried over from the ICL cells above
train_results_file = os.path.join('results', model_name, 'RF', 'train.csv')

dir_name = os.path.dirname(train_results_file)
os.makedirs(dir_name, exist_ok=True)

train_results_df.to_csv(train_results_file, index=False)

test_results_file = os.path.join('results', model_name, 'RF', 'test.csv')
test_results_df.to_csv(test_results_file, index=False)



Iteration: 1
R-squared: 0.59
MAE: 6.00
MSE: 56.27
Iteration: 2
R-squared: 0.69
MAE: 4.69
MSE: 31.39
Iteration: 3
R-squared: 0.47
MAE: 4.73
MSE: 39.58
Iteration: 4
R-squared: 0.44
MAE: 5.95
MSE: 51.82
Iteration: 5
R-squared: 0.66
MAE: 5.06
MSE: 37.89




Iteration: 6
R-squared: 0.58
MAE: 4.81
MSE: 37.84
Iteration: 7
R-squared: 0.75
MAE: 4.78
MSE: 34.49
Iteration: 8
R-squared: 0.53
MAE: 5.47
MSE: 45.99




Iteration: 9
R-squared: 0.46
MAE: 6.18
MSE: 62.06
Iteration: 10
R-squared: 0.45
MAE: 6.98
MSE: 71.18


