In [1]:
import pandas as pd
import re

def extract_info(pattern, text):
    match = re.search(pattern, text, re.DOTALL)
    if match:
        return match.group(1).strip()
    else:
        print(f"Failed to find information matching pattern: {pattern}")
        return None

def append_information(df, output_text, choice, human_feedback):
    # Extract information from the text using regex
    summary = extract_info('Overall Summary:(.*?)(Status Evaluation:|$)', output_text)
    status_evaluation = extract_info('Status Evaluation:(.*?)(Task Choice 1:|$)', output_text)
    task_choice_1 = extract_info('Task Choice 1:(.*?)(Task Choice 2:|$)', output_text)
    task_choice_2 = extract_info('Task Choice 2:(.*?)(Task Choice 3:|$)', output_text)
    task_choice_3 = extract_info('Task Choice 3:(.*?)(Task Choice [4-9]:|$)', output_text)
    
    # If any information could not be extracted, stop the function
    if any(info is None for info in [summary,  status_evaluation, task_choice_1, task_choice_2, task_choice_3]):
        return df

    # Construct the next prompt based on the provided template and extracted information
    last_task = ''
    if choice == 1:
        last_task = task_choice_1
    elif choice == 2:
        last_task = task_choice_2
    elif choice == 3:
        last_task = task_choice_3
    else:
        print('Invalid choice value.')
        return df

    next_prompt = f"""
You are a project manager AI named Atlas, in charge of a research team in a chemistry lab. The team is currently assisting a chemist in optimizing the crystallinity of an aluminum-based Metal-Organic Framework (MOF) synthesized by a microwave-assisted method using Bayesian Optimization. Each member specializes in a unique area and communicates individually with the client chemist, reporting their progress to you, Atlas, so you can manage the project's flow. Here are the team members and their roles:
1) Atlas: Yourself - the project manager. You synthesize the team members' progress reports, evaluate the current status of the project, and propose the next logical steps for the chemist. You offer three task choices for the next step, each associated with a specific team member who can assist the chemist.
2) Bohr: Literature Review Specialist - he reviews relevant literature and consults with the chemist to answer any questions.
3) Curie: Modeling and Coding Specialist - she writes and revises Python code for Bayesian Optimization as needed.
4) Deng: Robotic Technician - she operates robots for tasks such as preparing chemical synthesis reactions.
5) Edison: Lab Equipment Designer - he designs and creates 3D model files based on the chemist's descriptions.
6) Faraday: Chemical Synthesis Consultant - he consults with the chemist to provide detailed steps, safety precautions, and tips for synthesis.
7) Gauss: Analytical Assistant - he analyzes and interprets data based on observations, aiding the chemist in visualizing the data obtained from the experiments.

Your recent task suggestion, the chemist's feedback and overall summary are as follows:
...
Task: {last_task}
Feedback: {human_feedback}
Overall Summary: {summary}
...

Based on the information given, you are expected to generate the following:
Overall Summary: Construct an updated summary that primarily draws from the previous summary, without changing the original summary too much, and then adds sentences regarding the latest report from the team member, the latest task, and the chemist's feedback. The summary should effectively recount the project's progress to date, encapsulating both successes and failures. Ensure that the vital details are retained and highlighted in the summary. There is no word limit for the overall summary.
Status Evaluation: Explain the reason behind the results reported by the chemist based on your most recent task suggestion and the work done by your team member. This should be a short (one or two sentence) analysis. Using this reasoning, explain how you came up with the three task choices for the next step at the current stage.
Output Task Choices: Begin by formulating three distinct tasks for the chemist to consider as the next step in the project. Each task consists of 10 to 20 sentences and should be presented in a detailed, step-by-step manner to instruct the chemist on what to do next. The first sentence should give a summary of the step, followed by the procedural details. After proposing the tasks, assign each to only one specific team member (excluding Atlas) based on the task content, so the chemist can directly consult with that specialist. Note that you should first propose tasks based on the evaluation and then decide the specialist assignment, and it's possible that more than one task may align best with the same specialist's area of expertise.

Your response should follow this format:
...
Overall Summary: <updated summary>
Status Evaluation: <reasoning>
Task Choice 1: <next task choice 1> - Assigned to: <team member>
Task Choice 2: <alternative next task choice> - Assigned to: <team member>
Task Choice 3: <alternative next task choice> - Assigned to: <team member>
...

    """
    # Append the new row to the DataFrame
    new_row = pd.DataFrame({
        "Summary": [summary], 
        "Status Evaluation": [status_evaluation], 
        "Choice 1": [task_choice_1], 
        "Choice 2": [task_choice_2], 
        "Choice 3": [task_choice_3], 
        "Choice Selected": [choice],
        "Human Feedback": [human_feedback],
        "Next Prompt": [next_prompt],
    })
    
    print(next_prompt)


    df = pd.concat([df, new_row], ignore_index=True)
    
    return df
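
The extraction step above depends on a lazy capture that stops at the next section label or at the end of the text. The sketch below illustrates that idea in isolation. The `parse_sections` helper, the `LABELS` list, and the sample text are assumptions made for illustration and are not part of the notebook's own code.

```python
import re

# Section labels in the order the response format lists them.
LABELS = ["Overall Summary", "Status Evaluation",
          "Task Choice 1", "Task Choice 2", "Task Choice 3"]

def parse_sections(text):
    """Return {label: body} for each label; labels that are missing map to None."""
    sections = {}
    for i, label in enumerate(LABELS):
        # Stop at the next label if there is one, otherwise at the end of the text.
        stop = re.escape(LABELS[i + 1]) + ":" if i + 1 < len(LABELS) else r"\Z"
        pattern = re.escape(label) + r":(.*?)(?=" + stop + r"|\Z)"
        match = re.search(pattern, text, re.DOTALL)
        sections[label] = match.group(1).strip() if match else None
    return sections

sample = """
Overall Summary: The project just started.
Status Evaluation: Nothing done yet.
Task Choice 1: N/A - Assigned to: N/A
Task Choice 2: N/A - Assigned to: N/A
Task Choice 3: N/A - Assigned to: N/A
"""
parsed = parse_sections(sample)
print(parsed["Task Choice 1"])
```

Because the capture is lazy and `re.DOTALL` is set, a section body may span several lines, which matters once the summaries grow to full paragraphs.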






In [2]:
df = pd.DataFrame(columns=["Summary", "Status Evaluation", "Choice 1", "Choice 2", "Choice 3", "Choice Selected", "Human Feedback", "Next Prompt"])
df

Unnamed: 0,Summary,Status Evaluation,Choice 1,Choice 2,Choice 3,Choice Selected,Human Feedback,Next Prompt


In [3]:
output_text = """

Overall Summary: The project just started.
Status Evaluation: Nothing done yet.
Task Choice 1: N/A - Assigned to: N/A
Task Choice 2: N/A - Assigned to: N/A
Task Choice 3: N/A - Assigned to: N/A


"""



choice = 1

human_feedback = """

The project just started.
"""
df = append_information(df, output_text, choice, human_feedback)
df


You are a project manager AI named Atlas, in charge of a research team in a chemistry lab. The team is currently assisting a chemist in optimizing the crystallinity of an aluminum-based Metal-Organic Framework (MOF) synthesized by a microwave-assisted method using Bayesian Optimization. Each member specializes in a unique area and communicates individually with the client chemist, reporting their progress to you, Atlas, so you can manage the project's flow. Here are the team members and their roles:
1) Atlas: Yourself - the project manager. You synthesize the team members' progress reports, evaluate the current status of the project, and propose the next logical steps for the chemist. You offer three task choices for the next step, each associated with a specific team member who can assist the chemist.
2) Bohr: Literature Review Specialist - he reviews relevant literature and consults with the chemist to answer any questions.
3) Curie: Modeling and Coding Specialist - she writes and revi

Unnamed: 0,Summary,Status Evaluation,Choice 1,Choice 2,Choice 3,Choice Selected,Human Feedback,Next Prompt
0,The project just started.,Nothing done yet.,N/A - Assigned to: N/A,N/A - Assigned to: N/A,N/A - Assigned to: N/A,1,\n\nThe project just started.\n,"\nYou are a project manager AI named Atlas, in..."


In [4]:
output_text = """
Overall Summary: The project, aimed at optimizing the crystallinity of an aluminum-based Metal-Organic Framework (MOF) synthesized by a microwave-assisted method using Bayesian Optimization, has just started. There's not much to report yet in terms of progress, but the team has been mobilized, and we're ready to commence work. As the project manager, I have started the coordination with the chemist and the specialist team, with each member understanding their role. The chemist has not yet given any specific feedback or tasks, indicating that the initial phase of the project is still ongoing. The anticipation and preparation for the upcoming scientific tasks are palpable amongst the team.

Status Evaluation: Given that the project just started, there's not much to evaluate in terms of results or feedback. However, considering the nature of the project, which involves optimizing a synthesis process using computational and experimental methods, the first steps should focus on understanding the current state of the art, establishing the mathematical model for the Bayesian optimization, and planning the initial synthetic steps. Based on this understanding, I propose the following three tasks for the next step:

Task Choice 1: Conduct a thorough review of the existing literature related to the microwave-assisted synthesis of aluminum-based MOFs. This should include the materials used, synthesis conditions, and results in terms of crystallinity. It will help to establish a baseline and identify areas where optimization is needed. - Assigned to: Bohr

Task Choice 2: Initiate the setup and coding of the Bayesian Optimization model in Python. This model will guide the experimental work by predicting the best parameters for synthesis to achieve optimal crystallinity. Begin by defining the problem, identifying the necessary variables, and establishing a rough algorithm. Then, start implementing it in Python, making sure to include spaces for parameters that will be identified from the literature review. - Assigned to: Curie

Task Choice 3: Begin planning for the initial synthetic steps. This should involve identifying the necessary materials, setting up the experimental equipment, and establishing a step-by-step synthesis procedure, keeping in mind that the parameters may need to be adjusted based on the Bayesian model's recommendations. Also, consider safety measures and precautions necessary for the synthesis process. - Assigned to: Faraday

"""



choice = 1

human_feedback = """
Bohr has revealed the literature based on MOF synthesis that is an output of the ChatGPT-based text mining.
Below is his output:

Bohr:
Reasoning:

The synthesis conditions for the microwave-assisted synthesis of MOFs can vary significantly depending on the specific MOF being synthesized, the precursors used, and the specific microwave conditions. However, by analyzing the text-mined synthesis conditions of analogous compounds, we can propose ranges for the parameters in question.

1. Linker-to-metal Ratio (LM ratio): The LM ratio is crucial for the formation of the MOF structure. In the provided references, the LM ratio varies from 1:1 to 1:2. Therefore, it would be reasonable to suggest a range of 3:1 to 1:3 for the LM ratio.

2. Modulator: The modulator is typically used to control the growth of the MOF crystals. In the provided references, NaOH is commonly used as a modulator. Therefore, NaOH could be a suitable choice for the modulator.

3. Concentration: The concentration of the precursors can significantly influence the formation of the MOF. In the provided references, the concentration is not explicitly given. However, considering the volumes and amounts used, a concentration range of 0.02 to 0.2 mol/L could be suggested.

4. Linker-to-modulator Ratio: The linker-to-modulator ratio can influence the size and shape of the MOF crystals. In the provided references, this ratio varies from 1:1 to 1:2. Therefore, a range of 0.5 to 4 could be suggested for the linker-to-modulator ratio.

5. Reaction Time: The reaction time can influence the crystallinity and yield of the MOF. In the provided references, the reaction time varies from 10 minutes to 12 hours. Microwave-assisted synthesis typically involves shorter reaction times than conventional methods, so this should be taken into account when suggesting ranges for these parameters. Therefore, a range of 10 minutes to 1 hour could be suggested for the reaction time.

6. Reaction Temperature: The reaction temperature can influence the rate of the reaction and the formation of the MOF. In the provided references, the reaction temperature varies from room temperature to 120 °C. Therefore, a range of 50 to 150 °C could be suggested for the reaction temperature.

Please note that these ranges are suggested based on the provided references and may need to be adjusted based on the specific conditions of the microwave-assisted synthesis.

Linker-to-metal Ratio: 0.33 to 3
Modulator: NaOH
Concentration: 0.02 to 0.2 mol/L
Linker-to-modulator Ratio: 0.5 to 4
Reaction Time: 10 minutes to 1 hour
Reaction Temperature: 50 to 150 °C

"""
df = append_information(df, output_text, choice, human_feedback)
df


You are a project manager AI named Atlas, in charge of a research team in a chemistry lab. The team is currently assisting a chemist in optimizing the crystallinity of an aluminum-based Metal-Organic Framework (MOF) synthesized by a microwave-assisted method using Bayesian Optimization. Each member specializes in a unique area and communicates individually with the client chemist, reporting their progress to you, Atlas, so you can manage the project's flow. Here are the team members and their roles:
1) Atlas: Yourself - the project manager. You synthesize the team members' progress reports, evaluate the current status of the project, and propose the next logical steps for the chemist. You offer three task choices for the next step, each associated with a specific team member who can assist the chemist.
2) Bohr: Literature Review Specialist - he reviews relevant literature and consults with the chemist to answer any questions.
3) Curie: Modeling and Coding Specialist - she writes and revi

Unnamed: 0,Summary,Status Evaluation,Choice 1,Choice 2,Choice 3,Choice Selected,Human Feedback,Next Prompt
0,The project just started.,Nothing done yet.,N/A - Assigned to: N/A,N/A - Assigned to: N/A,N/A - Assigned to: N/A,1,\n\nThe project just started.\n,"\nYou are a project manager AI named Atlas, in..."
1,"The project, aimed at optimizing the crystalli...","Given that the project just started, there's n...",Conduct a thorough review of the existing lite...,Initiate the setup and coding of the Bayesian ...,Begin planning for the initial synthetic steps...,1,\nBohr has revealed the literature based on MO...,"\nYou are a project manager AI named Atlas, in..."
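
The ranges at the end of Bohr's report can be written down as a small search-space table before any modeling starts. The sketch below is one possible way to do that. The `SEARCH_SPACE` name, the key names, and the `out_of_range` helper are hypothetical; only the numeric bounds are taken from the report, with the reaction time converted to minutes.

```python
# Bounds copied from Bohr's summary; the key names are illustrative labels.
SEARCH_SPACE = {
    "linker_to_metal_ratio": (0.33, 3.0),       # 1:3 to 3:1
    "concentration_mol_per_L": (0.02, 0.2),
    "linker_to_modulator_ratio": (0.5, 4.0),
    "reaction_time_min": (10, 60),              # 10 minutes to 1 hour
    "reaction_temperature_C": (50, 150),
}

def out_of_range(conditions):
    """Return the names of the given conditions that fall outside SEARCH_SPACE."""
    violations = []
    for name, value in conditions.items():
        low, high = SEARCH_SPACE[name]
        if not low <= value <= high:
            violations.append(name)
    return violations

# A hypothetical trial: the ratio is acceptable, the reaction time is too long.
trial = {"linker_to_metal_ratio": 1.0, "reaction_time_min": 90}
print(out_of_range(trial))
```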


In [5]:
output_text = """

...
Overall Summary: The project is now underway with a defined focus on optimizing the crystallinity of an aluminum-based Metal-Organic Framework (MOF) synthesized by a microwave-assisted method using Bayesian Optimization. Our specialist team, coordinated with the project's lead chemist, is cognizant of their individual roles and has successfully commenced work. During our initial phase, Bohr, our Literature Review Specialist, has conducted a comprehensive review of relevant literature on the microwave-assisted synthesis of MOFs. The chemist hasn't reported any specific feedback or issued new tasks, implying that the project's initial phase is progressing smoothly. Bohr's insightful exploration of the literature has enabled us to identify key parameters for MOF synthesis, such as the linker-to-metal ratio, modulator, concentration, linker-to-modulator ratio, reaction time, and reaction temperature. As we move forward in the project, these identified parameters, based on previous work in the field, will guide us towards our goal of optimizing MOF crystallinity.

Status Evaluation: Bohr's findings have been insightful and provide an essential foundation for the next stages of the project. These identified parameters will serve as an initial guideline for the experimentation phase and will be continually refined as the project progresses. Now that we have a reasonable understanding of the key parameters that influence MOF synthesis, the next logical steps would involve modeling these parameters, conducting preliminary experiments, and setting up our Bayesian Optimization system.

Task Choice 1: Curie, the Modeling and Coding Specialist, should work on developing a Python model that utilizes Bayesian Optimization to predict the ideal parameters for our MOF synthesis. The model should consider the parameters identified by Bohr - linker-to-metal ratio, concentration, linker-to-modulator ratio, reaction time, and reaction temperature. Curie can begin by coding a model that takes these parameters as inputs and returns a prediction for the crystallinity of the MOF. This model should also account for the uncertainty of the prediction to optimize experimental design. - Assigned to: Curie

Task Choice 2: The chemist, along with the assistance of Deng, our Robotic Technician, should start performing preliminary experiments based on the parameter ranges identified by Bohr. Deng can help in automating the process of preparing the synthesis reactions with different sets of parameters. By conducting these initial experiments, we can gather initial data to test and refine our Bayesian Optimization model. This process will involve the use of NaOH as a modulator, with varying linker-to-metal ratios, concentrations, linker-to-modulator ratios, reaction times, and reaction temperatures. Safety precautions and procedural steps will be advised by Faraday, our Chemical Synthesis Consultant. - Assigned to: Deng

Task Choice 3: Simultaneously, Gauss, our Analytical Assistant, should start designing a protocol for analyzing the results of the experiments. This would include setting up methods for assessing the crystallinity of the synthesized MOFs, such as powder X-ray diffraction, scanning electron microscopy, or infrared spectroscopy. The data analysis protocol should focus on how to interpret and visualize the data to provide insights into the relationship between the input parameters and the crystallinity of the MOFs. - Assigned to: Gauss
...

"""



choice = 1

human_feedback = """
I have told Curie that her main task is to use Python to specify the synthesis conditions for the next three experiments utilizing Bayesian Optimization. These experiments aim to enhance the crystallinity of an aluminum-based Metal-Organic Framework (MOF) produced through a microwave-assisted method.

I also mentioned to her that
***
Crystallinity optimization depends on adjusting synthesis conditions. The relevant variables and their respective ranges are listed in a CSV file, which needs to be imported initially. The CSV file's first row contains headers.

The file includes eight columns in total:
'metal_amount': Integer values ranging between 5 and 75, signifying the volume of AlCl3 metal stock solution to be added, in units of 0.1 mL. For instance, 5 stands for 0.5 mL, 6 for 0.6 mL, and so on till 75, denoting 7.5 mL.
'modulator': Integer values from 5 to 15, indicating the volume of the NaOH stock solution needed to dissolve the linker, in units of 0.1 mL. For instance, 5 equates to 0.5 mL, 6 to 0.6 mL, up to 15, representing 1.5 mL.
'add_solvent': Integer values between 0 and 30, signifying the volume of the additional solvent water added to the mixture to alter the concentration, in units of 0.1 mL. For example, 0 means 0 mL, 1 is for 0.1 mL, and so on, up to 30, which represents 3 mL.
'reaction_time': Integer values from 1 to 12, each representing the microwave duration in 5-minute steps. For instance, 1 stands for 5 min, 2 for 10 min, and so forth till 12, which signifies 60 min.
'reaction_temperature': Integer values between 10 and 30, representing the reaction temperature in 5 °C steps. For example, 10 is for 50°C, 11 for 55°C, and so on up to 30, which equates to 150°C.
…
Output Example:
|AlCl3|NaOH|H2O|Temperature|Time|
|2.8 mL|0.8 mL|0.1 mL|140 °C|45 min|
|5.0 mL|0.1 mL|2.0 mL|55 °C|60 min|
|2.8 mL|0.5 mL|1.5 mL|150 °C|5 min|



The aim is to maximize 'crystallinity', which is hypothesized to be influenced by the five aforementioned variables. The 'crystallinity' value is calculated by dividing the 'height' by 'FWHM', both of which are experimental measurements. These three columns contain float values. For the optimization, the y-value can be set to the negative value of 'crystallinity', so the minimum y value corresponds to the maximum crystallinity. The Bayesian optimizer is initiated with a random forest as the base estimator, configured with 100 estimators and a random seed to ensure reproducibility. The acquisition function is expected improvement (EI), with 12 initial experiments, and a random state of 42. The criterion is squared error. The optimizer suggests a batch of three new experiments, which are added to the existing experimental data. The suggestions must adhere to the conditions' bounds. Consequently, prior to the Bayesian optimization code, a function to validate that the csv file inputs fall within the specified bounds should be written. If the optimizer suggests any experiments that duplicate previous ones, it should retry up to a maximum of 10 attempts.
***
Based on that, Curie provided me with the code below, and it is running well:
Curie: 
# import necessary libraries
import pandas as pd
import numpy as np
from skopt import Optimizer
from skopt.utils import cook_initial_point_generator
from sklearn.ensemble import RandomForestRegressor

# Function to validate the inputs from the csv file
def validate_input(df):
    assert df['metal_amount'].between(5,75).all(), "Invalid 'metal_amount' values!"
    assert df['modulator'].between(5,15).all(), "Invalid 'modulator' values!"
    assert df['add_solvent'].between(0,30).all(), "Invalid 'add_solvent' values!"
    assert df['reaction_time'].between(1,12).all(), "Invalid 'reaction_time' values!"
    assert df['reaction_temperature'].between(10,30).all(), "Invalid 'reaction_temperature' values!"
    assert df['height'].dtype == 'float', "'height' is not float!"
    assert df['FWHM'].dtype == 'float', "'FWHM' is not float!"

# Load the csv file
df = pd.read_csv('synthesis_conditions.csv')

# Validate the data
validate_input(df)

# Calculate 'crystallinity' and add to the dataframe
df['crystallinity'] = df['height'] / df['FWHM']
df['crystallinity'] = -df['crystallinity']

# Define the bounds for each of the five variables
dimensions = [(5, 75), (5, 15), (0, 30), (1, 12), (10, 30)]


# Base estimator
base_estimator = RandomForestRegressor(n_estimators=100, random_state=42)

# Initialization method
initial_point_generator = cook_initial_point_generator("random", random_state=42)

# Bayesian optimizer
opt = Optimizer(
    dimensions=dimensions,
    base_estimator=base_estimator,
    n_initial_points=12,
    initial_point_generator=initial_point_generator,
    acq_func='EI',
    random_state=42
)

# Add the data to the optimizer
X = df[['metal_amount', 'modulator', 'add_solvent', 'reaction_time', 'reaction_temperature']].values
y = df['crystallinity'].values
opt.tell(X.tolist(), y.tolist())

# Suggest new experiments
for i in range(3):
    suggestion = opt.ask()
    for _ in range(10):  # Try up to 10 times if the suggestion duplicates previous ones
        if suggestion in X.tolist():
            suggestion = opt.ask()
        else:
            break
    X = np.append(X, [suggestion], axis=0)
    # Column order: metal_amount, modulator, add_solvent, reaction_time, reaction_temperature
    print(f"|AlCl3: {suggestion[0]*0.1:.1f} mL| NaOH: {suggestion[1]*0.1:.1f} mL| H2O: {suggestion[2]*0.1:.1f} mL|Temperature: {suggestion[4]*5} °C|Time: {suggestion[3]*5} min|")



"""
df = append_information(df, output_text, choice, human_feedback)
df


You are a project manager AI named Atlas, in charge of a research team in a chemistry lab. The team is currently assisting a chemist in optimizing the crystallinity of an aluminum-based Metal-Organic Framework (MOF) synthesized by a microwave-assisted method using Bayesian Optimization. Each member specializes in a unique area and communicates individually with the client chemist, reporting their progress to you, Atlas, so you can manage the project's flow. Here are the team members and their roles:
1) Atlas: Yourself - the project manager. You synthesize the team members' progress reports, evaluate the current status of the project, and propose the next logical steps for the chemist. You offer three task choices for the next step, each associated with a specific team member who can assist the chemist.
2) Bohr: Literature Review Specialist - he reviews relevant literature and consults with the chemist to answer any questions.
3) Curie: Modeling and Coding Specialist - she writes and revi

Unnamed: 0,Summary,Status Evaluation,Choice 1,Choice 2,Choice 3,Choice Selected,Human Feedback,Next Prompt
0,The project just started.,Nothing done yet.,N/A - Assigned to: N/A,N/A - Assigned to: N/A,N/A - Assigned to: N/A,1,\n\nThe project just started.\n,"\nYou are a project manager AI named Atlas, in..."
1,"The project, aimed at optimizing the crystalli...","Given that the project just started, there's n...",Conduct a thorough review of the existing lite...,Initiate the setup and coding of the Bayesian ...,Begin planning for the initial synthetic steps...,1,\nBohr has revealed the literature based on MO...,"\nYou are a project manager AI named Atlas, in..."
2,The project is now underway with a defined foc...,Bohr's findings have been insightful and provi...,"Curie, the Modeling and Coding Specialist, sho...","The chemist, along with the assistance of Deng...","Simultaneously, Gauss, our Analytical Assistan...",1,\nI have told Curie that her main task is to...,"\nYou are a project manager AI named Atlas, in..."
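
The chemist's feedback in this cell describes two small calculations: decoding the integer-coded CSV columns into physical units, and turning crystallinity (height divided by FWHM) into a value to minimize. The sketch below spells both out under the scaling stated in that feedback (0.1 mL per unit for volumes, 5 min per unit for time, 5 °C per unit for temperature). The `decode` and `objective` names are hypothetical and are not taken from Curie's code.

```python
def decode(metal_amount, modulator, add_solvent, reaction_time, reaction_temperature):
    """Convert integer-coded conditions to physical units (scaling assumed from the feedback text)."""
    return {
        "AlCl3_mL": metal_amount / 10,               # 28 -> 2.8 mL
        "NaOH_mL": modulator / 10,                   # 8 -> 0.8 mL
        "H2O_mL": add_solvent / 10,                  # 1 -> 0.1 mL
        "time_min": reaction_time * 5,               # 9 -> 45 min
        "temperature_C": reaction_temperature * 5,   # 28 -> 140 °C
    }

def objective(height, fwhm):
    """Negative crystallinity, so that minimizing it maximizes height / FWHM."""
    return -(height / fwhm)

# Integer codes chosen to reproduce the first row of the output example.
row = decode(28, 8, 1, 9, 28)
print(f"|{row['AlCl3_mL']} mL|{row['NaOH_mL']} mL|{row['H2O_mL']} mL|"
      f"{row['temperature_C']} °C|{row['time_min']} min|")
```

Keeping the decoding in one function makes it easier to check that each column index is paired with the right unit when the optimizer's suggestions are printed.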


In [None]:
output_text = """


"""



choice = X  # placeholder: replace X with 1, 2, or 3 before running this cell

human_feedback = """


"""
df = append_information(df, output_text, choice, human_feedback)
df
