## Rapid prototyping with LLMs to generate potentially impactful experiments for carbon capture

In [1]:
from llm import chat_openai, chat_openai_high_temp
import json

In [None]:
# GPT template
prompt = f'''

'''
res = chat_openai(prompt, model="gpt-3.5-turbo")[0]
print(res)

In [None]:
# GPT template for human feedback
feedback = ""

res = chat_openai(prompt, model="gpt-3.5-turbo", chat_history=[
    {
        "role": "assistant",
        "content": f"{res}"
    }, {
        "role": "user",
        "content": f"{feedback}"
    }
])[0]
print(res)
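
The feedback template above sends the previous reply and the human comment back to the model as `chat_history`. A minimal sketch of a helper that builds that two-message list follows. `build_feedback_history` is a hypothetical name, not part of the `llm` module, and the way `chat_openai` orders `chat_history` relative to `prompt` is not shown in this notebook, so treat the result as an input to that function rather than a description of its internals.

```python
def build_feedback_history(previous_reply, feedback):
    """Return the assistant/user message pair used by the feedback template.

    Hypothetical helper: it only mirrors the literal list written in the
    template cell and adds a guard against sending empty feedback.
    """
    if not feedback.strip():
        raise ValueError("feedback is empty; nothing to send back to the model")
    return [
        {"role": "assistant", "content": str(previous_reply)},
        {"role": "user", "content": feedback},
    ]


# Illustrative values only, not output from the notebook.
history = build_feedback_history('{"CO2 capture rate": "..."}', "Add a permanence metric.")
print([message["role"] for message in history])
```

The list could then be passed as `chat_history=history` in the same call the template uses.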

### Give vision with important consequences

In [25]:
objective = "Carbon capture 1 Gt of CO2 per year ASAP"

In [26]:
# Brainstorm
prompt = f'''
Objective: 
{objective}

Task:
What are the key metrics to measure success?

The output should be in JSON format: 
```json
{{
  "<insert metric>" : "<insert metric description>",
  etc.
}}

Respond only with the output, with no explanation or conversation. I expect a first-rate answer.
'''
res = chat_openai(prompt, model="gpt-3.5-turbo")[0]
print(res)

Prompt messages:  [{'role': 'system', 'content': 'You are a helpful and educated carbon capture research consultant and an educated and helpful researcher and programmer. Answer as correctly, clearly, and concisely as possible.'}, {'role': 'user', 'content': '\nObjective: \nCarbon capture 1 Gt of CO2 per year ASAP\n\nTask:\nWhat are the key metrics to measure success?\n\nThe output should be in JSON format: \n```json\n{\n  "<insert metric>" : "<insert metric description>",\n  etc.\n}\n\nRespond only with the output, with no explanation or conversation. I expect a first-rate answer.\n'}]
Completion info:  {
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "{\n  \"CO2 capture rate\": \"The rate at which CO2 is captured and removed from the atmosphere, measured in metric tons per year.\",\n  \"Energy consumption\": \"The amount of energy required for the carbon capture process, measured in kilowatt-hours per metric ton of CO2 capt

In [27]:
# Give feedback if necessary
feedback = "You forgot the metrics for feasibility, ease of measurement, and permanence."

res = chat_openai(prompt, model="gpt-3.5-turbo", chat_history=[
    {
        "role": "assistant",
        "content": f"{res}"
    }, {
        "role": "user",
        "content": f"{feedback}"
    }
])[0]
print(res)


Prompt messages:  [{'role': 'system', 'content': 'You are a helpful and educated carbon capture research consultant and an educated and helpful researcher and programmer. Answer as correctly, clearly, and concisely as possible.'}, {'role': 'assistant', 'content': '{\n  "CO2 capture rate": "The rate at which CO2 is captured and removed from the atmosphere, measured in metric tons per year.",\n  "Energy consumption": "The amount of energy required for the carbon capture process, measured in kilowatt-hours per metric ton of CO2 captured.",\n  "Cost per metric ton of CO2 captured": "The cost incurred for capturing and storing one metric ton of CO2, measured in currency (e.g., USD) per metric ton of CO2 captured.",\n  "Storage capacity": "The total amount of CO2 that can be stored in the chosen storage method, measured in metric tons.",\n  "Environmental impact": "The assessment of the carbon capture process on the environment, including factors such as water usage, land requirements, and p

In [28]:
# Evaluation
prompt = f'''
Objective: 
{objective}

Key metrics:
{res}

Task:
Prioritize these key metrics of success by importance.

The output should be in JSON format: 
```json
{{
  "<insert metric>" : "<insert metric description>",
  etc.
}}

Respond only with the output, with no explanation or conversation. I expect a first-rate answer.
'''
res = json.loads(chat_openai(prompt, model="gpt-3.5-turbo")[0])
print(res)

Prompt messages:  [{'role': 'system', 'content': 'You are a helpful and educated carbon capture research consultant and an educated and helpful researcher and programmer. Answer as correctly, clearly, and concisely as possible.'}, {'role': 'user', 'content': '\nObjective: \nCarbon capture 1 Gt of CO2 per year ASAP\n\nKey metrics:\n{\n  "CO2 capture rate": "The rate at which CO2 is captured and removed from the atmosphere, measured in metric tons per year.",\n  "Energy consumption": "The amount of energy required for the carbon capture process, measured in kilowatt-hours per metric ton of CO2 captured.",\n  "Cost per metric ton of CO2 captured": "The cost incurred for capturing and storing one metric ton of CO2, measured in currency (e.g., USD) per metric ton of CO2 captured.",\n  "Storage capacity": "The total amount of CO2 that can be stored in the chosen storage method, measured in metric tons.",\n  "Environmental impact": "The assessment of the carbon capture process on the environm
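
The cell above passes the model reply straight to `json.loads`, while the prompt itself shows the expected output inside a `json` code fence. If the model echoes that fence, `json.loads` raises an error. The sketch below strips an optional fence before parsing. `parse_json_reply` is a hypothetical helper name, and it assumes the reply contains nothing besides the (optionally fenced) JSON object.

```python
import json
import re

# Optional opening ```json fence, the payload, and an optional closing fence.
_FENCE = re.compile(r"^```(?:json)?\s*(.*?)\s*(?:```)?$", re.DOTALL)


def parse_json_reply(text):
    """Parse a model reply as JSON, tolerating a surrounding code fence."""
    cleaned = text.strip()
    match = _FENCE.match(cleaned)
    if match:
        cleaned = match.group(1)
    # json.loads still raises json.JSONDecodeError on malformed payloads.
    return json.loads(cleaned)


# Illustrative reply, not output from the notebook.
print(parse_json_reply('```json\n{"CO2 capture rate": "tons per year"}\n```'))
```

In this notebook it would replace the direct call, for example `res = parse_json_reply(chat_openai(prompt, model="gpt-3.5-turbo")[0])`.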

In [29]:
file_name = 'metrics.json'

with open(file_name, 'w', encoding='utf-8') as f:
    json.dump(res, f, indent=4)

metrics = res
print(metrics)

{'CO2 capture rate': 'The rate at which CO2 is captured and removed from the atmosphere, measured in metric tons per year.', 'Feasibility': 'The evaluation of the technical and economic feasibility of implementing the carbon capture process at the desired scale.', 'Cost per metric ton of CO2 captured': 'The cost incurred for capturing and storing one metric ton of CO2, measured in currency (e.g., USD) per metric ton of CO2 captured.', 'Storage capacity': 'The total amount of CO2 that can be stored in the chosen storage method, measured in metric tons.', 'Permanence': 'The long-term stability and durability of the stored CO2, ensuring it remains securely stored and does not re-enter the atmosphere.', 'Energy consumption': 'The amount of energy required for the carbon capture process, measured in kilowatt-hours per metric ton of CO2 captured.', 'Environmental impact': 'The assessment of the carbon capture process on the environment, including factors such as water usage, land requirement

In [30]:
focus_metrics = dict(list(metrics.items())[:3])
focus_metrics

{'CO2 capture rate': 'The rate at which CO2 is captured and removed from the atmosphere, measured in metric tons per year.',
 'Feasibility': 'The evaluation of the technical and economic feasibility of implementing the carbon capture process at the desired scale.',
 'Cost per metric ton of CO2 captured': 'The cost incurred for capturing and storing one metric ton of CO2, measured in currency (e.g., USD) per metric ton of CO2 captured.'}
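
Taking the first three items works only because the model was asked to list the metrics in priority order and because Python dicts (3.7+) and `json.loads` keep keys in the order they appear. A small sketch that makes this selection explicit is below; `top_n_items` is a hypothetical helper name, and the sample dict is illustrative.

```python
from itertools import islice


def top_n_items(ranked, n):
    """Return the first n key/value pairs of a dict, keeping their order."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return dict(islice(ranked.items(), n))


# Illustrative data: keys are assumed to be already sorted by priority.
ranked = {"CO2 capture rate": "...", "Feasibility": "...", "Permanence": "..."}
print(list(top_n_items(ranked, 2)))  # ['CO2 capture rate', 'Feasibility']
```

The same helper would cover the later `focus_factors` selection with `n=2`.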

In [31]:
# Clarifying questions
prompt = f'''
Objective: 
{objective}

Key metrics:
{focus_metrics}

Task:
Do you have any clarifying questions about the objective or metrics before you try to hypothesize a plan to achieve the objective? If so, ask them and answer them. If not, say "no clarifying questions". 
'''
res = chat_openai(prompt, model="gpt-3.5-turbo")[0]
print(res)

Prompt messages:  [{'role': 'system', 'content': 'You are a helpful and educated carbon capture research consultant and an educated and helpful researcher and programmer. Answer as correctly, clearly, and concisely as possible.'}, {'role': 'user', 'content': '\nObjective: \nCarbon capture 1 Gt of CO2 per year ASAP\n\nKey metrics:\n{\'CO2 capture rate\': \'The rate at which CO2 is captured and removed from the atmosphere, measured in metric tons per year.\', \'Feasibility\': \'The evaluation of the technical and economic feasibility of implementing the carbon capture process at the desired scale.\', \'Cost per metric ton of CO2 captured\': \'The cost incurred for capturing and storing one metric ton of CO2, measured in currency (e.g., USD) per metric ton of CO2 captured.\'}\n\nTasK:\nDo you have any clarifying questions about the objective or metrics before you try to hypothesize a plan to achieve the objective? If so, ask them and answer them. If not, say "no clarifying questions". \n'

### Decompose key factors of the problem

In [35]:
# Brainstorm
prompt = f'''
Objective: 
{objective}

Key metrics:
{focus_metrics}

Task:
What are factors that are most related to the objective?

The output should be in JSON format: 
```json
{{
  "<insert factor>" : "<insert factor description>",
  etc.
}}

Respond only with the output, with no explanation or conversation. I expect a first-rate answer.
'''
res = chat_openai(prompt, model="gpt-3.5-turbo")[0]
print(res)

Prompt messages:  [{'role': 'system', 'content': 'You are a helpful and educated carbon capture research consultant and an educated and helpful researcher and programmer. Answer as correctly, clearly, and concisely as possible.'}, {'role': 'user', 'content': '\nObjective: \nCarbon capture 1 Gt of CO2 per year ASAP\n\nKey metrics:\n{\'CO2 capture rate\': \'The rate at which CO2 is captured and removed from the atmosphere, measured in metric tons per year.\', \'Feasibility\': \'The evaluation of the technical and economic feasibility of implementing the carbon capture process at the desired scale.\', \'Cost per metric ton of CO2 captured\': \'The cost incurred for capturing and storing one metric ton of CO2, measured in currency (e.g., USD) per metric ton of CO2 captured.\'}\n\nTask:\nWhat are factors that are most related to the objective?\n\nThe output should be in JSON format: \n```json\n{\n  "<insert factor>" : "<insert factor description>",\n  etc.\n}\n\nRespond only with the output

In [38]:
key_factors = json.loads(res)
key_factors

{'Availability of suitable carbon capture technologies': 'The existence and readiness of technologies that can effectively capture and remove CO2 from the atmosphere at the desired scale.',
 'Access to suitable storage sites': 'The availability of geologically suitable sites where captured CO2 can be safely and permanently stored without leakage.',
 'Investment and funding': 'Sufficient financial resources and investment to support the development, deployment, and operation of large-scale carbon capture projects.',
 'Policy and regulatory support': 'Supportive policies and regulations that incentivize and facilitate the implementation of carbon capture projects, including carbon pricing mechanisms and emissions reduction targets.',
 'Infrastructure and transportation': 'The presence of infrastructure and transportation networks to transport captured CO2 from capture sites to storage sites efficiently and safely.',
 'Public acceptance and social awareness': 'Positive public perception a

In [46]:
# Evaluation
prompt = f'''
Objective: 
{objective}

Key factors:
{key_factors}

Task:
Prioritize these key factors by how large and sensitive they are to achieving the objective.

The output should be in JSON format: 
```json
{{
  "<insert factor>" : "<insert factor description>",
  etc.
}}

Respond only with the output, with no explanation or conversation. I expect a first-rate answer.
'''
factors = json.loads(chat_openai(prompt, model="gpt-3.5-turbo")[0])
print(factors)

Prompt messages:  [{'role': 'system', 'content': 'You are a helpful and educated carbon capture research consultant and an educated and helpful researcher and programmer. Answer as correctly, clearly, and concisely as possible.'}, {'role': 'user', 'content': '\nObjective: \nCarbon capture 1 Gt of CO2 per year ASAP\n\nKey factors:\n{\'Availability of suitable carbon capture technologies\': \'The existence and readiness of technologies that can effectively capture and remove CO2 from the atmosphere at the desired scale.\', \'Access to suitable storage sites\': \'The availability of geologically suitable sites where captured CO2 can be safely and permanently stored without leakage.\', \'Investment and funding\': \'Sufficient financial resources and investment to support the development, deployment, and operation of large-scale carbon capture projects.\', \'Policy and regulatory support\': \'Supportive policies and regulations that incentivize and facilitate the implementation of carbon ca

In [47]:
focus_factors = dict(list(factors.items())[:2])
focus_factors

{'Availability of suitable carbon capture technologies': 'The existence and readiness of technologies that can effectively capture and remove CO2 from the atmosphere at the desired scale.',
 'Access to suitable storage sites': 'The availability of geologically suitable sites where captured CO2 can be safely and permanently stored without leakage.'}

### Current state analysis

In [None]:
# Evaluation
prompt = f'''
Objective: 
{objective}

Key metrics:
{focus_metrics}

Key factors:
{key_factors}

Task:
Prioritize these key factors by how large and sensitive they are to achieving the objective.

The output should be in JSON format: 
```json
{{
  "<insert factor>" : "<insert factor description>",
  etc.
}}

Respond only with the output, with no explanation or conversation. I expect a first-rate answer.
'''
factors = json.loads(chat_openai(prompt, model="gpt-3.5-turbo")[0])
print(factors)

### Use an LLM to generate crazy ideas with temperature = 1

In [6]:
# GPT template
prompt = f'''
Goal:
{objective}

Write 10 crazy ideas for novel impactful hypotheses and experiments to achieve the goal above.

The output should be in JSON format: 
```json
{{
  "<insert index>" : "<insert idea>",
  etc.
}}

Respond only with the output, with no explanation or conversation.
'''
ideas_json = json.loads(chat_openai_high_temp(prompt, model="gpt-3.5-turbo")[0])
print(ideas_json)

Prompt:  
Goal:
Carbon capture 1 Gt of CO2 per year ASAP

Write 10 crazy ideas for novel impactful hypotheses and experiments to achieve the goal above.

The output should be in JSON format: 
```json
{
  "<insert index>" : "<insert idea>",
  etc.
}

Respond only with the output, with no explanation or conversation.

Completion info:  {
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "{\n  \"1\" : \"Develop a genetically modified plant species with enhanced carbon storage capabilities\",\n  \"2\" : \"Create large-scale artificial carbon sinks using bioengineered microorganisms\",\n  \"3\" : \"Design and deploy a network of large-scale ocean-based carbon capture systems\",\n  \"4\" : \"Utilize nanotechnology to develop highly efficient carbon capture materials\",\n  \"5\" : \"Explore the use of electrochemical processes for direct air capture of CO2\",\n  \"6\" : \"Implement large-scale reforestation efforts with fast-growing tr

In [11]:
file_name = 'ideas.json'

with open(file_name, 'w', encoding='utf-8') as f:
    json.dump(ideas_json, f, indent=4)
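
This section samples ideas at temperature = 1, and the next one evaluates them at temperature = 0. As a rough illustration of why that split makes sense, the sketch below shows the standard way temperature rescales logits before a softmax: higher values flatten the distribution (more varied output), and values near 0 concentrate it on the top choice. The function name and the example logits are assumptions for illustration. How `chat_openai_high_temp` or the hosted API applies temperature internally is not documented in this notebook.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        # Treat T = 0 as greedy decoding: all mass on the largest logit.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for three candidate tokens.
logits = [2.0, 1.0, 0.0]
for t in (1.0, 0.5, 0.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```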

### Use an evaluator to verify the feasibility and reasonableness of each approach, with temperature = 0

Note (8/8/23): This is a rapid prototype, so it does not yet include search, which would help massively with learning. For now, the learning memory is built solely from creative ideation and critical feedback generated by LLMs.

In [9]:
# GPT template
# Possible improvement: have the model cite facts as direct quotes, so that each quote and its URL can be retrieved from the answer.

prompt = f'''
Goal:
{objective}

Task:
You are an evaluator. Given the goal and the ideas, what are the most feasible ideas to achieve the goal? For each idea, outline both the proposition of why it is a feasible idea, and the opposition of what the idea's biggest weaknesses might be to make it not a feasible idea. Conclude if the idea is feasible. Use your existing knowledge to the best of your ability.

The output should be in JSON format: 
```json
{{
  "<insert idea index>" : {{
    "idea": "<insert idea>",
    "opposition":  "<insert reasons why the idea might not be a feasible idea to achieve the goal>",
    "proposition": "<insert reasons why the idea might be a feasible idea to achieve the goal>",
    "conclusion": "<insert yes or no if the idea is a feasible idea to achieve the goal, and then your reasoning>"
  }}, etc.
}}
```

Ideas:
{ideas_json}

Respond only with the output, with no explanation or conversation.
'''
evaluator_feedback = json.loads(chat_openai(prompt, model="gpt-3.5-turbo")[0])
print(evaluator_feedback)

Prompt:  
Goal:
Carbon capture 1 Gt of CO2 per year ASAP

Task:
You are an evaluator. Given the goal and the ideas, what are the most feasible ideas to achieve the goal? For each idea, outline both the proposition of why it is a feasible idea, and the opposition of what the idea's biggest weaknesses might be to make it not a feasible idea. Conclude if the idea is feasible. Use your existing knowledge to the best of your ability.

The output should be in JSON format: 
```json
{
  "<insert idea index>" : {
    "idea": "<insert idea>",
    "opposition":  "<insert reasons why the idea might not be a feasible idea to achieve the goal>",
    "proposition": "<insert reasons why the idea might be a feasible idea to achieve the goal>",
    "conclusion": "<insert yes or no if the idea is a feasible idea to achieve the goal, and then your reasoning>"
  }, etc.
}
```

Ideas:
{'1': 'Develop a genetically modified plant species with enhanced carbon storage capabilities', '2': 'Create large-scale art

In [10]:
file_name = 'evaluation.json'

with open(file_name, 'w', encoding='utf-8') as f:
    json.dump(evaluator_feedback, f, indent=4)
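
The evaluator prompt asks for a `conclusion` field that begins with yes or no, so the saved evaluation can be filtered down to the ideas judged feasible. A minimal sketch follows. `feasible_ideas` is a hypothetical helper, and it assumes the model followed the requested schema (a dict of entries with `idea` and `conclusion` keys); entries that deviate are simply skipped.

```python
def feasible_ideas(evaluation):
    """Keep ideas whose conclusion starts with 'yes' (case-insensitive)."""
    kept = {}
    for index, entry in evaluation.items():
        conclusion = str(entry.get("conclusion", "")).strip().lower()
        if conclusion.startswith("yes"):
            kept[index] = entry.get("idea", "")
    return kept


# Illustrative data only, not output from the notebook.
sample = {
    "1": {"idea": "Idea A", "conclusion": "Yes, because ..."},
    "2": {"idea": "Idea B", "conclusion": "No, because ..."},
}
print(feasible_ideas(sample))  # {'1': 'Idea A'}
```

Called as `feasible_ideas(evaluator_feedback)`, it would give a shortlist to carry into the next iteration.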

### Iterate with a learnings memory, probably using RAG to fetch it
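
One way this step could start, sketched under loud assumptions: learnings are stored as short text notes, and the notes most relevant to the current prompt are fetched and prepended to it. The keyword-overlap scoring below is a simple stand-in for the embedding-based retrieval usually meant by RAG, and every name here (`tokenize`, `retrieve_learnings`, the sample notes) is hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_learnings(query, learnings, k=3):
    """Return up to k notes that share the most tokens with the query."""
    query_tokens = tokenize(query)
    scored = []
    for note in learnings:
        overlap = len(query_tokens & tokenize(note))
        if overlap:
            scored.append((overlap, note))
    # The sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [note for _, note in scored[:k]]


# Placeholder notes, not real learnings from the notebook.
memory = [
    "note about capture cost",
    "note about storage sites",
    "note about public acceptance",
]
print(retrieve_learnings("access to storage sites", memory, k=1))
```

The retrieved notes could be joined into a "Learnings:" block in the ideation and evaluation prompts on the next pass.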