# Inverse Alloy Design
Inverse prediction of material properties is a key challenge in materials science. For example, the dependence of the bulk modulus on alloy composition is typically evaluated in the forward direction, by computing the bulk modulus for a given concentration. In practice, however, the target is the property itself, so we would like to predict the concentration required to achieve a specific bulk modulus.

To demonstrate the application of LLMs for inverse alloy design, we extended the calculation of bulk moduli to complex alloys in solid solution. For computational efficiency we use 32-atom special quasi-random structures (SQS) to simulate solid solutions.

## Python Functions
Three Python functions are used: `get_complex_alloy_bulk_structure()`, which creates the solid-solution structure of the complex alloy by linearly interpolating the nearest-neighbor distance; `get_atom_dict_equilibrated_structure()`, which calculates the equilibrium volume; and `get_bulk_modulus()`, which computes the bulk modulus. All three are already included in the `LangSim` package to accelerate inverse alloy design.
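To make the interpolation step concrete, the following is a minimal sketch of a concentration-weighted nearest-neighbor distance for an fcc solid solution. It is not the `LangSim` implementation: the helper `interpolated_nn_distance()` and the lookup table are hypothetical, the lattice constants are approximate reference values, and the sketch assumes one concentration per element with the concentrations summing to 1.

```python
import math

# Approximate fcc lattice constants in Angstrom (assumed reference values,
# not taken from LangSim).
FCC_LATTICE_CONSTANTS = {"Cu": 3.61, "Ag": 4.09, "Au": 4.08}


def interpolated_nn_distance(element_lst, concentration_lst):
    """Return the concentration-weighted fcc nearest-neighbor distance.

    Hypothetical helper for illustration only.
    """
    if len(element_lst) != len(concentration_lst):
        raise ValueError("one concentration per element is required")
    if not math.isclose(sum(concentration_lst), 1.0, abs_tol=1e-6):
        raise ValueError("concentrations must sum to 1")
    # In an fcc lattice the nearest-neighbor distance is a / sqrt(2).
    return sum(
        c * FCC_LATTICE_CONSTANTS[el] / math.sqrt(2)
        for el, c in zip(element_lst, concentration_lst)
    )


# Equiatomic Cu-Au as an example
d_nn = interpolated_nn_distance(["Cu", "Au"], [0.5, 0.5])
a_alloy = d_nn * math.sqrt(2)  # convert back to an fcc lattice constant
```

The interpolated distance only provides a starting guess for the cell size; the equilibrium volume is still determined by the subsequent relaxation step.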

In [1]:
from langsim.tools.simulation_atomistics import get_atom_dict_equilibrated_structure, get_bulk_modulus
from langsim.tools.simulation_complex_alloys import get_complex_alloy_bulk_structure

In [2]:
atoms_dict = get_complex_alloy_bulk_structure.invoke({
    "element_lst": ["Ag", "Cu", "Au"],
    "concentration_lst": [0.33, 0.33],
})



In [3]:
atoms_equilibrated_dict = get_atom_dict_equilibrated_structure.invoke({
    "atom_dict": atoms_dict, 
    "calculator_str": "emt",
})

       Step     Time          Energy          fmax
LBFGS:    0 00:21:19        1.007380        0.896836
LBFGS:    1 00:21:19        0.903343        0.780359
LBFGS:    2 00:21:19        0.544874        0.264546
LBFGS:    3 00:21:19        0.527220        0.258248
LBFGS:    4 00:21:19        0.509179        0.241072
LBFGS:    5 00:21:19        0.491373        0.153396
LBFGS:    6 00:21:19        0.480369        0.112176
LBFGS:    7 00:21:19        0.475807        0.107402
LBFGS:    8 00:21:19        0.472776        0.099382
LBFGS:    9 00:21:19        0.468296        0.091074
LBFGS:   10 00:21:19        0.464523        0.081476
LBFGS:   11 00:21:19        0.462265        0.074352
LBFGS:   12 00:21:19        0.460756        0.069380
LBFGS:   13 00:21:19        0.458932        0.062557
LBFGS:   14 00:21:19        0.456936        0.053164
LBFGS:   15 00:21:19        0.455497        0.049564
LBFGS:   16 00:21:19        0.454711        0.045882
LBFGS:   17 00:21:19        0.454131        0.03

In [4]:
get_bulk_modulus.invoke({
    "atom_dict": atoms_equilibrated_dict, 
    "calculator_str": "emt"
})

130.37777754206886
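For context, a bulk modulus is commonly obtained from the curvature of the energy-volume curve at the equilibrium volume, $B = V_0 \, \partial^2 E / \partial V^2$. How `get_bulk_modulus()` evaluates this internally is not shown in this notebook, so the sketch below is only a generic illustration: the function `bulk_modulus_from_ev()` is hypothetical, the fit is a simple parabola, and the energy-volume data are synthetic.

```python
import numpy as np

EV_PER_A3_TO_GPA = 160.21766208  # 1 eV/Angstrom^3 expressed in GPa


def bulk_modulus_from_ev(volumes, energies):
    """Fit E(V) with a parabola and return (V0, B), with B in GPa.

    Hypothetical helper for illustration only.
    """
    a, b, _ = np.polyfit(volumes, energies, 2)
    v0 = -b / (2 * a)      # volume at the minimum of the parabola
    curvature = 2 * a      # d2E/dV2 in eV/Angstrom^6
    return v0, v0 * curvature * EV_PER_A3_TO_GPA


# Synthetic, purely illustrative data: a harmonic E(V) around V0 = 16 Angstrom^3
v0_true, b_true = 16.0, 140.0  # b_true in GPa
volumes = np.linspace(0.95 * v0_true, 1.05 * v0_true, 11)
energies = 0.5 * (b_true / EV_PER_A3_TO_GPA) / v0_true * (volumes - v0_true) ** 2

v0_fit, b_fit = bulk_modulus_from_ev(volumes, energies)
```

Because the synthetic data are exactly quadratic, the fit recovers the input values; for real energy-volume data, an equation-of-state fit over a small strain range is the more usual choice.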

## Agent
The agent follows the same LangChain [custom agent](https://python.langchain.com/v0.1/docs/modules/agents/how_to/custom_agent/) tutorial as the previous examples. This highlights that no complex modifications to the prompt are required.

In [5]:
from getpass import getpass

In [6]:
from langchain.agents import AgentExecutor
from langchain.agents.format_scratchpad.openai_tools import format_to_openai_tool_messages
from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI

In [7]:
OPENAI_API_KEY = getpass(prompt='Enter your OpenAI Token:')
llm = ChatOpenAI(model="gpt-4o", temperature=0, openai_api_key=OPENAI_API_KEY)
tools = [get_complex_alloy_bulk_structure, get_atom_dict_equilibrated_structure, get_bulk_modulus]
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are very powerful assistant, but don't know current events. For each query vailidate that it contains a chemical element and otherwise cancel.",
        ),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)
agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_to_openai_tool_messages(
            x["intermediate_steps"]
        ),
    }
    | prompt
    | llm.bind_tools(tools)
    | OpenAIToolsAgentOutputParser()
)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

Enter your OpenAI Token: ········


## Dialog
Finally, the agent is tasked with finding the concentration of a copper-gold alloy whose bulk modulus matches 145 GPa within an error bound of 2 GPa. In addition, it receives the hint to use linear interpolation rather than random sampling or other strategies to find the desired concentration.
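The strategy requested from the agent can be sketched as an iterated linear interpolation (regula falsi) between two bracketing concentrations. This is only an illustration of the concept, not the sequence of steps the LLM actually takes: `find_concentration()` is a hypothetical helper, and `toy_bulk_modulus()` is a made-up smooth function standing in for the structure creation, relaxation and bulk modulus calculation, so its numbers carry no physical meaning.

```python
def find_concentration(bulk_modulus_fn, target, tolerance,
                       c_low=0.0, c_high=1.0, max_iter=20):
    """Find c with |B(c) - target| <= tolerance by repeated linear interpolation.

    Hypothetical helper for illustration only.
    """
    b_low, b_high = bulk_modulus_fn(c_low), bulk_modulus_fn(c_high)
    # The end points themselves may already satisfy the tolerance.
    if abs(b_low - target) <= tolerance:
        return c_low, b_low
    if abs(b_high - target) <= tolerance:
        return c_high, b_high
    if (b_low - target) * (b_high - target) > 0:
        raise ValueError("target is not bracketed by the end points")
    for _ in range(max_iter):
        # Linear interpolation between the two bracketing points
        c_new = c_low + (target - b_low) * (c_high - c_low) / (b_high - b_low)
        # Validate the prediction by evaluating the bulk modulus
        b_new = bulk_modulus_fn(c_new)
        if abs(b_new - target) <= tolerance:
            return c_new, b_new
        # Keep the target bracketed for the next interpolation step
        if (b_new - target) * (b_low - target) > 0:
            c_low, b_low = c_new, b_new
        else:
            c_high, b_high = c_new, b_new
    raise RuntimeError("no concentration found within max_iter iterations")


def toy_bulk_modulus(c):
    """Made-up B(c) in GPa with a slight bowing; not EMT data."""
    return 135.0 + 35.0 * c - 20.0 * c * (1.0 - c)


c_found, b_found = find_concentration(toy_bulk_modulus, target=145.0, tolerance=2.0)
```

Each loop iteration corresponds to one validation step in the dialog: predict a concentration, compute its bulk modulus, and repeat only if the result lies outside the requested error bound.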

In [8]:
lst = list(agent_executor.stream({"input": "Using linear interpolation find the concentration of an Copper Gold Alloy with a bulk modulus around 145 GPa with an error of plus or minus 2 GPa using the EMT simulation code. Validate your prediction by computing the bulk modulus and do not stop until you calculate the bulk modulus with the defined uncertainty."}))



> Entering new AgentExecutor chain...

Invoking: `get_complex_alloy_bulk_structure` with `{'element_lst': ['Cu'], 'concentration_lst': [1.0], 'number_of_atoms': 32, 'crystal_structure': 'fcc'}`
responded: Sure, let's start by validating the presence of chemical elements in your query. You mentioned Copper (Cu) and Gold (Au), which are valid chemical elements.

Next, we'll use linear interpolation to estimate the concentration of a Cu-Au alloy with a bulk modulus around 145 GPa using the EMT simulation code. We'll then validate this prediction by computing the bulk modulus and iterating until we achieve the desired accuracy.

### Step 1: Define the initial concentrations and compute their bulk moduli

We'll start with two initial concentrations:
1. Pure Copper (100% Cu, 0% Au)
2. Pure Gold (0% Cu, 100% Au)

Let's compute the bulk moduli for these initial concentrations using the EMT simulation code.

[0m[36;1m[1;3mnumbers=[13, 13, 13, 13, 13, 13, 13, 13, 13, 13

## Summary
This example highlights the application of generative models for materials science. While the three previous examples could also be addressed by computing the material properties for all unaries, storing them in a database and providing them to the user on request, this is no longer feasible for the space of binary or even ternary alloys. Just like a student, the LLM is able to apply the abstract concept of linear interpolation to predict the concentration that corresponds to a bulk modulus of 145 GPa.