In [1]:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

In [10]:
simple_prompt_template = ChatPromptTemplate.from_messages([
    ('user',"""
You are required to extract material information from the text provided below and output it in the desired format, which is generally a list of dictionaries, each including metadata and property information. Specifically, the metadata has the following structure:
metadata: {{
    "composition": "%s",
    "label": "%s",
    "processing_kw": "%s"
    }}
e.g., metadata: {{
    "composition": "Mn0.2CoCrNi",
    "label": "A1",
    "processing_kw": ["annealed at 800C for 1hr", "cold rolled 50%"]
    }}
Note: processing_kw lists the processing keywords (be succinct) from the experimental section, which is provided separately from the property section; do not include any processing involved in property testing.
Property-specific instruction: {property_instruction}

Synthesis section:
{synthesis_para}

Property section:
{property}
""")
]
)
phase_instruction = """If materials are synthesized with different parameters (e.g., temperature, duration, processing method), treat each as a distinct sample. Extract phase information for each sample separately, if phase information is present."""
strength_instruction = """Extract all mechanical properties related to yield strength (ys), ultimate tensile strength (uts), and strain from the text.

Follow these rules:
- Material composition should be given as a nominal chemical formula in atomic percentage, e.g., "Mn0.2CoCrNi", not as a descriptive phrase.
- Prioritize table values over text if there is a conflict.
- If the value provided is a range, for example, "from 200 MPa to 300 MPa", extract it as "200-300 MPa".
- If the value is given as "greater than" or "less than", for example, "greater than 400 MPa", extract it as ">400 MPa".
- If the value is given as "approximately" or "around", for example, "approximately 250 MPa", extract it as "≈250 MPa".
- Otherwise, extract the value as it is."""
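
The rules above encode ranges and bounds as strings such as "200-300 MPa", ">400 MPa", and "≈250 MPa". If a later step needs numbers, a small parser along these lines could split such a string into a qualifier, bounds, and unit. This is only a sketch: `ParsedValue` and `parse_value` are hypothetical names that do not appear in this notebook, and the pattern covers only the forms listed in the rules.

```python
import re
from typing import NamedTuple, Optional


class ParsedValue(NamedTuple):
    """Hypothetical container for one parsed property value."""
    qualifier: str          # "", ">", "<", "≈", or "range"
    low: float
    high: Optional[float]   # only set for ranges
    unit: str


# Optional qualifier, a number, an optional "-number" upper bound, then the unit.
_VALUE_PATTERN = re.compile(
    r"^\s*(?P<q>[<>≈])?\s*(?P<low>\d+(?:\.\d+)?)"
    r"(?:\s*-\s*(?P<high>\d+(?:\.\d+)?))?\s*(?P<unit>\S.*)?$"
)


def parse_value(text: str) -> ParsedValue:
    """Parse strings like '200-300 MPa', '>400 MPa', '≈250 MPa', or '430 MPa'."""
    match = _VALUE_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized value format: {text!r}")
    high = float(match["high"]) if match["high"] else None
    qualifier = "range" if high is not None else (match["q"] or "")
    unit = (match["unit"] or "").strip()
    return ParsedValue(qualifier, float(match["low"]), high, unit)


for example in ["200-300 MPa", ">400 MPa", "≈250 MPa", "430 MPa"]:
    print(example, "->", parse_value(example))
```

Keeping the qualifier separate from the number avoids silently turning a bound such as ">400 MPa" into an exact value.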

In [11]:
from json_schemas import PhaseRecords, StrengthRecords

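# temperature=0 reduces run-to-run variation in the extracted records
model = ChatOpenAI(model='gpt-4.1', temperature=0)

def build_chain(prompt, pydantic_model):
    """Pipe the prompt into the model, constraining the output to the given Pydantic schema."""
    return prompt | model.with_structured_output(pydantic_model, method='json_schema')

phase_chain = build_chain(simple_prompt_template, PhaseRecords)
strength_chain = build_chain(simple_prompt_template, StrengthRecords)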
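
`json_schemas` is a local module that is not shown in this notebook. Judging only from the record reprs printed further down, its models might look roughly like the following Pydantic v2 sketch. The field types, and the choice to make the strength fields required but nullable, are assumptions; the real module may differ, for instance by carrying field descriptions.

```python
from typing import List, Optional

from pydantic import BaseModel


class MetaData(BaseModel):
    composition: str
    label: str
    processing_kw: List[str]


class Phase(BaseModel):
    phases: List[str]


class PhaseRecord(BaseModel):
    metadata: MetaData
    phase: Phase


class PhaseRecords(BaseModel):
    records: List[PhaseRecord]


class Strength(BaseModel):
    # Assumed: every field must be present but may be null
    ys: Optional[float]
    uts: Optional[float]
    strain: Optional[float]
    temperature: Optional[float]
    test_type: Optional[str]


class StrengthRecord(BaseModel):
    metadata: MetaData
    strength: Strength


class StrengthRecords(BaseModel):
    records: List[StrengthRecord]


# Build one record by hand to check that the shape matches the printed output
sample = PhaseRecord(
    metadata=MetaData(
        composition="V10Cr15Mn5Fe35Co10Ni25",
        label="FG annealed 900C 10min",
        processing_kw=["cold rolled 79%", "annealed 900C 10min"],
    ),
    phase=Phase(phases=["FCC"]),
)
print(sample.model_dump())
```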

In [12]:
synthesis_para = """The HEA with a nominal composition of V10Cr15Mn5Fe35Co10Ni25 (at%) was fabricated using vacuum induction melting furnace using pure elements of V, Cr, Mn, Fe, Co, and Ni (purity >99.9%). The as-cast sample was subjected to homogenization heat treatment at 1100 °C for 6 h under an Ar atmosphere, followed by water quenching. The homogenized sample was cold rolled through multiple passes with a final rolling reduction ratio of ≈79% (from 6.2 to 1.3 mm). The disk-shaped samples (10 mm diameter) were prepared from the cold rolled sheet using electro-discharge machining. The disk samples were annealed at two different conditions (900 °C for 10 min and 1100 °C for 60 min) to obtain microstructure with fine grains and coarse grains, respectively. Finally, the HPT process was carried out on the annealed samples at different turns (N = 1/4, 1, and 5) using a pressure of 6 GPa and a rotation rate of 1 revolution per minute (rpm)."""
phase_property = """Figure shows the XRD patterns of fine-grained (FG) and coarse-grained (CG) samples before HPT processing (N = 0) and after five turns of HPT processing (N = 5). The XRD patterns of the annealed sample show a single FCC phase with no secondary phases, which confirms the thermodynamic calculations reported earlier for this HEA. The XRD patterns of HPT-processed samples also show a single FCC phase, indicating no evidence of phase transformation after HPT. Also, HPT processing leads to a significant peak broadening of XRD patterns, indicating grain refinement and lattice strain.
Figure shows the bright-field (BF) TEM images of the FG sample after 1/4 turn (Figure a) and 1 turn (Figure b) of HPT processing. The BF TEM image of the FG sample after 1/4 turn shows a high density of tangled dislocations (Figure a). The fact that accumulation of dislocations can occur within the grains could be because of the planar dislocation-slip modes.. The selected area electron diffraction (SAED) pattern shown in the inset of Figure a shows only FCC phase diffraction spots (taken along [011] zone axis), confirming the presence of a single phase. The microstructure of the FG sample after one turn of HPT (Figure b) shows significant grain refinement with an average size of ≈100 nm. Correspondingly, the SAED pattern of one turn shows the concentric FCC ring pattern, indicating nanograins with high-angle grain boundaries. The strain contrast observed in nanograins can be related to the accumulation of dislocations."""
strength_property = """The engineering stress versus strain curves of the FG and CG samples before and after HPT processing at different turns (1/4, 1, and 5 turns) are shown in Figure . The tensile curves indicate that samples in annealed condition exhibit low yield strength (YS) and high elongation to failure, whereas the HPT processing led to an enhancement in YS accompanied by a reduction in the elongation to failure. The strength-ductility trade-off is in agreement with the classic mechanical response of ultra-fine grained (UFG) metallic materials processed by SPD techniques. Table summarizes the relative data for the engineering YS, ultimate tensile strength (UTS), and total elongation to failure (δ) for both the FG and CG samples. The CG sample in annealed condition shows a YS of ≈230 MPa, and the FG sample shows a relatively higher YS of ≈430 MPa. The higher YS observed in the FG sample can be attributed to the Hall-Petch strengthening effect. The UTSs of the FG and CG samples are 751 and 532 MPa, respectively, and the CG sample presents slightly higher total elongation to failure in comparison with the FG sample (CG ≈58% and FG ≈48%).
Figure 8. Engineering stress-strain curves of the FG and CG samples before (annealed) and after HPT processing with increasing numbers of turns.
Table 1. YS and UTS, and total elongation to failure (δ) of the FG and CG samples of V10Cr15Mn5Fe35Co10Ni25 HEA before (N = 0) and after HPT processing with increasing the number of turns
Number of HPT turns [N]	Sample	YS [MPa]	UTS [MPa]	δ [%]
0	FG	430	720	48.1
CG	230	532	57.6
1/4	FG	1120	1447	15.9
CG	1270	1502	17.3
1	FG	1630	1813	12.9
CG	1660	1854	14.3
5	FG	1940	1986	6.0
CG	1950	2015	6.3
After 1/4 turn of HPT processing, the UTS increases significantly for both FG ( ≈ 1.4 GPa) and CG ( ≈ 1.5 GPa) samples. The strength increases further with increasing the number of turns, and the increase in strength is accompanied by a decrease in total elongation to failure. After five turns, the tensile strength of FG and CG samples reached similar values ( ≈ 2 GPa) with a total elongation to failure of ≈6%. The CG sample presents relatively higher YS and UTS as compared with the FG sample after 1/4 turn and 1 turn of HPT. This indicates that the strength enhancement in the CG sample is higher than that of the FG sample, considering the initial strength of the FG and CG samples (before HPT)."""
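
The Table 1 rows pasted into `strength_property` drop the leading turn count on the CG lines, because the source table merged that cell across the FG and CG rows. A minimal sketch of recovering the full rows, which could serve as a reference when spot-checking the extracted records, is shown here. It assumes the columns are tab-separated as pasted; `table_text`, `parse_table`, and `table_rows` are hypothetical names.

```python
# Rows copied from Table 1 in strength_property (assumed separator: tab)
table_text = (
    "Number of HPT turns [N]\tSample\tYS [MPa]\tUTS [MPa]\tδ [%]\n"
    "0\tFG\t430\t720\t48.1\n"
    "CG\t230\t532\t57.6\n"
    "1/4\tFG\t1120\t1447\t15.9\n"
    "CG\t1270\t1502\t17.3\n"
    "1\tFG\t1630\t1813\t12.9\n"
    "CG\t1660\t1854\t14.3\n"
    "5\tFG\t1940\t1986\t6.0\n"
    "CG\t1950\t2015\t6.3"
)


def parse_table(text):
    """Forward-fill the turn count onto rows where the merged cell left it out."""
    rows, turns = [], None
    for line in text.splitlines()[1:]:      # skip the header row
        cells = line.split("\t")
        if len(cells) == 5:                 # this row carries its own N
            turns, cells = cells[0], cells[1:]
        sample, ys, uts, strain = cells
        rows.append({
            "turns": turns,
            "sample": sample,
            "ys": float(ys),
            "uts": float(uts),
            "strain": float(strain),
        })
    return rows


table_rows = parse_table(table_text)
for row in table_rows:
    print(row)
```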

In [21]:
r_phase = phase_chain.invoke(
    {'synthesis_para': synthesis_para, 'property': phase_property, 'property_instruction': phase_instruction}
)

In [None]:
r_strength = strength_chain.invoke(
    {'synthesis_para': synthesis_para, 'property': strength_property, 'property_instruction': strength_instruction}
)
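
Each `invoke` call supplies the three template variables, while the doubled braces in the prompt stay literal. The sketch below illustrates that convention with plain `str.format` on a shortened, hypothetical copy of the template. It assumes that the default f-string style of `ChatPromptTemplate` treats `{{` and `}}` the same way; it does not call LangChain itself.

```python
from string import Formatter

# Shortened stand-in for the prompt: doubled braces are literal, single braces are variables
template = (
    'metadata: {{\n'
    '    "composition": "%s"\n'
    '    }}\n'
    'Instruction: {property_instruction}\n'
    '\n'
    'Synthesis section:\n'
    '{synthesis_para}\n'
    '\n'
    'Property section:\n'
    '{property}\n'
)

# List the variables the template expects
variables = sorted({field for _, field, _, _ in Formatter().parse(template) if field})
print(variables)

# Fill the template the way the invoke() dictionary would
rendered = template.format(
    property_instruction="Extract phases.",
    synthesis_para="Annealed at 900 °C for 10 min.",
    property="XRD shows a single FCC phase.",
)
print(rendered)
```

A missing key raises `KeyError` in this sketch, which is a quick way to catch a misspelled variable name before spending an API call.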

In [27]:
r_phase.records

[PhaseRecord(metadata=MetaData(composition='V10Cr15Mn5Fe35Co10Ni25', label='FG annealed 900C 10min', processing_kw=['vacuum induction melting', 'homogenized 1100C 6h Ar', 'water quenched', 'cold rolled 79%', 'EDM disk', 'annealed 900C 10min']), phase=Phase(phases=['FCC'])),
 PhaseRecord(metadata=MetaData(composition='V10Cr15Mn5Fe35Co10Ni25', label='CG annealed 1100C 60min', processing_kw=['vacuum induction melting', 'homogenized 1100C 6h Ar', 'water quenched', 'cold rolled 79%', 'EDM disk', 'annealed 1100C 60min']), phase=Phase(phases=['FCC'])),
 PhaseRecord(metadata=MetaData(composition='V10Cr15Mn5Fe35Co10Ni25', label='FG annealed 900C 10min HPT N=1/4', processing_kw=['vacuum induction melting', 'homogenized 1100C 6h Ar', 'water quenched', 'cold rolled 79%', 'EDM disk', 'annealed 900C 10min', 'HPT 6GPa 1/4 turn']), phase=Phase(phases=['FCC'])),
 PhaseRecord(metadata=MetaData(composition='V10Cr15Mn5Fe35Co10Ni25', label='FG annealed 900C 10min HPT N=1', processing_kw=['vacuum induction 

In [28]:
r_strength.records

[StrengthRecord(metadata=MetaData(composition='V10Cr15Mn5Fe35Co10Ni25', label='FG, annealed', processing_kw=['homogenized at 1100C 6h Ar', 'cold rolled 79%', 'annealed 900C 10min']), strength=Strength(ys=430.0, uts=720.0, strain=48.1, temperature=None, test_type='tensile')),
 StrengthRecord(metadata=MetaData(composition='V10Cr15Mn5Fe35Co10Ni25', label='CG, annealed', processing_kw=['homogenized at 1100C 6h Ar', 'cold rolled 79%', 'annealed 1100C 60min']), strength=Strength(ys=230.0, uts=532.0, strain=57.6, temperature=None, test_type='tensile')),
 StrengthRecord(metadata=MetaData(composition='V10Cr15Mn5Fe35Co10Ni25', label='FG, HPT 1/4 turn', processing_kw=['homogenized at 1100C 6h Ar', 'cold rolled 79%', 'annealed 900C 10min', 'HPT 1/4 turn 6GPa 1rpm']), strength=Strength(ys=1120.0, uts=1447.0, strain=15.9, temperature=None, test_type='tensile')),
 StrengthRecord(metadata=MetaData(composition='V10Cr15Mn5Fe35Co10Ni25', label='CG, HPT 1/4 turn', processing_kw=['homogenized at 1100C 6h A

In [29]:
r_phase.records[0].model_dump()

{'metadata': {'composition': 'V10Cr15Mn5Fe35Co10Ni25',
  'label': 'FG annealed 900C 10min',
  'processing_kw': ['vacuum induction melting',
   'homogenized 1100C 6h Ar',
   'water quenched',
   'cold rolled 79%',
   'EDM disk',
   'annealed 900C 10min']},
 'phase': {'phases': ['FCC']}}

In [38]:
# Collect metadata from both chains: phase records first, then strength records
records_metadata = (
    [record.metadata.model_dump() for record in r_phase.records]
    + [record.metadata.model_dump() for record in r_strength.records]
)

In [36]:
records_metadata[:2]

[{'composition': 'V10Cr15Mn5Fe35Co10Ni25',
  'label': 'FG annealed 900C 10min',
  'processing_kw': ['vacuum induction melting',
   'homogenized 1100C 6h Ar',
   'water quenched',
   'cold rolled 79%',
   'EDM disk',
   'annealed 900C 10min']},
 {'composition': 'V10Cr15Mn5Fe35Co10Ni25',
  'label': 'CG annealed 1100C 60min',
  'processing_kw': ['vacuum induction melting',
   'homogenized 1100C 6h Ar',
   'water quenched',
   'cold rolled 79%',
   'EDM disk',
   'annealed 1100C 60min']}]

In [None]:
records_metadata[6:9]

[{'composition': 'V10Cr15Mn5Fe35Co10Ni25',
  'label': 'FG, annealed',
  'processing_kw': ['homogenized at 1100C 6h Ar',
   'cold rolled 79%',
   'annealed 900C 10min']},
 {'composition': 'V10Cr15Mn5Fe35Co10Ni25',
  'label': 'CG, annealed',
  'processing_kw': ['homogenized at 1100C 6h Ar',
   'cold rolled 79%',
   'annealed 1100C 60min']},
 {'composition': 'V10Cr15Mn5Fe35Co10Ni25',
  'label': 'FG, HPT 1/4 turn',
  'processing_kw': ['homogenized at 1100C 6h Ar',
   'cold rolled 79%',
   'annealed 900C 10min',
   'HPT 1/4 turn 6GPa 1rpm']}]

In [None]:
example_records = [
    # Metadata as returned by the strength chain
    [
        {'composition': 'V10Cr15Mn5Fe35Co10Ni25',
         'label': 'FG, annealed',
         'processing_kw': ['homogenized at 1100C 6h Ar',
                           'cold rolled 79%',
                           'annealed 900C 10min']},
        {'composition': 'V10Cr15Mn5Fe35Co10Ni25',
         'label': 'CG, annealed',
         'processing_kw': ['homogenized at 1100C 6h Ar',
                           'cold rolled 79%',
                           'annealed 1100C 60min']},
        {'composition': 'V10Cr15Mn5Fe35Co10Ni25',
         'label': 'FG, HPT 1/4 turn',
         'processing_kw': ['homogenized at 1100C 6h Ar',
                           'cold rolled 79%',
                           'annealed 900C 10min',
                           'HPT 1/4 turn 6GPa 1rpm']},
    ],
    # Metadata as returned by the phase chain
    [
        {'composition': 'V10Cr15Mn5Fe35Co10Ni25',
         'label': 'FG annealed 900C 10min',
         'processing_kw': ['vacuum induction melting',
                           'homogenized 1100C 6h Ar',
                           'water quenched',
                           'cold rolled 79%',
                           'EDM disk',
                           'annealed 900C 10min']},
        {'composition': 'V10Cr15Mn5Fe35Co10Ni25',
         'label': 'CG annealed 1100C 60min',
         'processing_kw': ['vacuum induction melting',
                           'homogenized 1100C 6h Ar',
                           'water quenched',
                           'cold rolled 79%',
                           'EDM disk',
                           'annealed 1100C 60min']},
    ],
]
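
The two chains describe the same physical sample with different labels and keyword lists ('FG, annealed' versus 'FG annealed 900C 10min'). One possible way to line them up, offered only as a sketch and not as the method this notebook goes on to use, is to compare the word overlap of `processing_kw` with a Jaccard score. The names `kw_tokens`, `jaccard`, and `best_match` are hypothetical, and the data below is a trimmed copy of the metadata above.

```python
strength_meta = [
    {"label": "FG, annealed",
     "processing_kw": ["homogenized at 1100C 6h Ar", "cold rolled 79%",
                       "annealed 900C 10min"]},
    {"label": "CG, annealed",
     "processing_kw": ["homogenized at 1100C 6h Ar", "cold rolled 79%",
                       "annealed 1100C 60min"]},
]
phase_meta = [
    {"label": "FG annealed 900C 10min",
     "processing_kw": ["vacuum induction melting", "homogenized 1100C 6h Ar",
                       "water quenched", "cold rolled 79%", "EDM disk",
                       "annealed 900C 10min"]},
    {"label": "CG annealed 1100C 60min",
     "processing_kw": ["vacuum induction melting", "homogenized 1100C 6h Ar",
                       "water quenched", "cold rolled 79%", "EDM disk",
                       "annealed 1100C 60min"]},
]


def kw_tokens(record):
    """Lower-cased set of whitespace-separated words across all processing keywords."""
    return {tok.lower() for kw in record["processing_kw"] for tok in kw.split()}


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(record, candidates):
    """Return the candidate whose keyword tokens overlap most with the record's."""
    tokens = kw_tokens(record)
    return max(candidates, key=lambda c: jaccard(tokens, kw_tokens(c)))


pairs = {s["label"]: best_match(s, phase_meta)["label"] for s in strength_meta}
print(pairs)
```

Because the shared homogenization and rolling steps dominate the overlap, the scores for the right and wrong candidates are close; in practice a minimum-score threshold or a comparison restricted to the distinguishing steps would likely be needed.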