# SciAgents
## Automating scientific discovery through multi-agent intelligent graph reasoning

#### Alireza Ghafarollahi, Markus J. Buehler, MIT, 2024 mbuehler@MIT.EDU

In [7]:
'''
!git clone https://github.com/lamm-mit/SciAgentsDiscovery.git
%cd SciAgentsDiscovery
!pip install -e .
'''


In [None]:
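import os

# Fill in your own API keys; they are intentionally left blank here.
OpenAI_key = ''
os.environ['OPENAI_API_KEY'] = OpenAI_key

SemanticScholar_api_key = ''
os.environ['SEMANTIC_SCHOLAR_API_KEY'] = SemanticScholar_api_key

# Output directory passed to research_generation below.
data_dir_output = './graph_giant_component_LLMdiscovery_example/'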
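Keys pasted into a notebook are easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming the keys are already exported in the shell that launched Jupyter; `read_key` is a hypothetical helper and not part of SciAgentsDiscovery:

```python
import os


def read_key(name: str, default: str = "") -> str:
    """Return the environment variable `name`, or `default` if unset or blank."""
    value = os.environ.get(name, "").strip()
    return value if value else default


# Both names match the environment variables set in the cell above.
OpenAI_key = read_key("OPENAI_API_KEY")
SemanticScholar_api_key = read_key("SEMANTIC_SCHOLAR_API_KEY")
```

With this pattern the assignments to `os.environ` become unnecessary, since the variables are already present in the process environment.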

In [None]:
from huggingface_hub import hf_hub_download

# Download the knowledge graph (GraphML) from the Hugging Face Hub.
graph_name = 'large_graph_simple_giant.graphml'
hf_hub_download(
    repo_id='lamm-mit/bio-graph-1K',
    filename=graph_name,
    local_dir='./graph_giant_component'
)

# Download the node embeddings file from the same repository.
embedding_name = 'embeddings_simple_giant_ge-large-en-v1.5.pkl'
hf_hub_download(
    repo_id='lamm-mit/bio-graph-1K',
    filename=embedding_name,
    local_dir='./graph_giant_component'
)
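As a quick sanity check after the download, node and edge counts can be read straight from the GraphML file, which is plain XML. The sketch below uses only the standard library; `count_graphml` is a hypothetical helper, and the path in the usage comment assumes the `local_dir` and file name used above.

```python
import xml.etree.ElementTree as ET

# Standard GraphML XML namespace, in ElementTree's "{uri}tag" notation.
GRAPHML_NS = "{http://graphml.graphdrawing.org/xmlns}"


def count_graphml(path: str) -> tuple:
    """Return (number of nodes, number of edges) found in a GraphML file."""
    root = ET.parse(path).getroot()
    nodes = root.findall(f".//{GRAPHML_NS}node")
    edges = root.findall(f".//{GRAPHML_NS}edge")
    return len(nodes), len(edges)


# Usage (assumed path):
# count_graphml('./graph_giant_component/large_graph_simple_giant.graphml')
```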

In [2]:
from ScienceDiscovery import *
make_dir_if_needed(data_dir_output)

100%|██████████| 33159/33159 [00:00<00:00, 221168.48it/s]
100%|██████████| 48753/48753 [00:00<00:00, 94297.90it/s]


'Directory already exists.'

### Setting up OpenAI GPT model for the LLM

In [11]:
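from functools import partial  # explicit import; harmless if the star import above already provides it

# Bind the model settings once so later calls only need to supply the prompt.
default_generate_OpenAIGPT = partial(
    generate_OpenAIGPT,
    openai_api_key=OpenAI_key,
    #gpt_model='gpt-4-turbo',
    gpt_model='gpt-4o',
    temperature=0.2,
    max_tokens=2048,
)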
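The cell above relies on `functools.partial` to pre-bind model settings. The sketch below shows that mechanism in isolation; `generate_stub` is a hypothetical stand-in that makes no API call, and its parameter names are assumptions chosen to mirror the keyword arguments used above.

```python
from functools import partial


def generate_stub(prompt, gpt_model="model-a", temperature=1.0, max_tokens=16):
    """Hypothetical stand-in for an LLM call; returns the settings it received."""
    return {
        "prompt": prompt,
        "gpt_model": gpt_model,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


# Pre-bind the settings, as done for default_generate_OpenAIGPT above.
default_generate = partial(
    generate_stub, gpt_model="model-b", temperature=0.2, max_tokens=2048
)

result = default_generate("Summarize the path.")

# Keyword arguments bound by partial can still be overridden per call.
override = default_generate("Summarize the path.", temperature=0.7)
```

Because the bound keywords remain overridable, the same wrapper can serve as both `generate` and `generate_graph_expansion` while still allowing per-call adjustments.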

## Research idea generation using the non-automated multi-agent model

In [12]:
research_generation(
    G=G,
    embedding_tokenizer=embedding_tokenizer,
    embedding_model=embedding_model,
    node_embeddings=node_embeddings,
    generate=default_generate_OpenAIGPT,
    generate_graph_expansion=default_generate_OpenAIGPT,
    randomness_factor=0.2,
    num_random_waypoints=4,
    shortest_path=False,
    second_hop=False,
    data_dir=data_dir_output,
    save_files=False,
    verbatim=True,
    # keyword_1='energy-intensive', keyword_2='protein',
    keyword_1='agents',
    keyword_2='automatic driving',
)

>>> Selected nodes: agents and automatic driving
Random walk to get path: agents and automatic driving
Original:  agents --> automatic driving
Selected:  agent --> automation
Done random walk to get path
Path: ['agent', 'reinforcement', 'mechanical properties', 'well-studied structural proteins', 'mechanical properties', 'acrylic acid treatment', 'mechanical properties', 'porosity', 'mechanical properties', 'material y', 'mechanical properties', 'materials science', 'biomimetic 3d printing', 'robotics', 'automation']
agent -- undergoes -- reinforcement -- significantly improves through effective stress transfer -- mechanical properties -- differ from -- well-studied structural proteins -- differ from -- mechanical properties -- changes -- acrylic acid treatment -- changes -- mechanical properties -- are strongly related to -- porosity -- are strongly related to -- mechanical properties -- did not significantly impact -- material y -- did not significantly impact -- mechanical propertie
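The log above shows two steps: each keyword is mapped to a graph node ("agents" to "agent"), and a path is then sampled that detours through other nodes, which is why "mechanical properties" recurs. The sketch below illustrates the general idea with cosine-similarity matching and a waypoint path built from breadth-first-search legs. It is not the library's implementation: all function names are hypothetical, the 2-D vectors are made up, and the toy edges are limited to consecutive node pairs from the printed path.

```python
import math
import random
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_node(query_vec, node_vecs):
    """Return the node whose embedding is most similar to the query vector."""
    return max(node_vecs, key=lambda n: cosine(query_vec, node_vecs[n]))


def build_adjacency(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def bfs_path(adj, start, goal):
    """Shortest path by breadth-first search, or None if unreachable."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in adj[node]:
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def waypoint_path(adj, start, goal, num_waypoints, rng):
    """Join shortest-path legs through randomly chosen intermediate nodes."""
    candidates = [n for n in adj if n not in (start, goal)]
    waypoints = rng.sample(candidates, min(num_waypoints, len(candidates)))
    stops = [start, *waypoints, goal]
    path = [start]
    for a, b in zip(stops, stops[1:]):
        leg = bfs_path(adj, a, b)
        if leg is None:
            return None
        path.extend(leg[1:])  # skip the first node; it ends the previous leg
    return path


# Toy edges: consecutive node pairs taken from the printed path above.
TOY_EDGES = [
    ("agent", "reinforcement"),
    ("reinforcement", "mechanical properties"),
    ("mechanical properties", "porosity"),
    ("mechanical properties", "materials science"),
    ("materials science", "biomimetic 3d printing"),
    ("biomimetic 3d printing", "robotics"),
    ("robotics", "automation"),
]

adjacency = build_adjacency(TOY_EDGES)
path = waypoint_path(adjacency, "agent", "automation", 2, random.Random(0))

# Made-up 2-D embeddings, only to show the keyword-to-node matching step.
toy_vecs = {"agent": [1.0, 0.0], "automation": [0.0, 1.0]}
matched = nearest_node([0.9, 0.1], toy_vecs)
```

Because each leg is a shortest path between unrelated stops, hub nodes can be revisited, which matches the repeated nodes in the sampled path.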

{"hypothesis": "The integration of biomimetic 3D printing with reinforced agents that undergo stress transfer can create robotic components with mechanical properties surpassing those of natural structural proteins, by optimizing porosity and acrylic acid treatment to enhance automation capabilities.", "outcome": "The research is expected to yield robotic components with tensile strengths of up to 150 MPa and elasticity similar to elastin, with a porosity of less than 5% to ensure optimal strength and flexibility. Acrylic acid treatment will enhance surface properties, leading to improved stress transfer and durability.", "mechanisms": "The reinforcement process involves embedding carbon nanotubes as agents within a polymer matrix, which undergoes stress transfer to enhance mechanical properties. Acrylic acid treatment modifies the surface chemistry, increasing compatibility between the matrix and reinforcing agents. Porosity is controlled through 3D printing parameters, ensuring minimal void spaces to maximize mechanical performance.", "design_principles": "1. Utilize carbon nanotubes as reinforcing agents due to their high tensile strength and ability to undergo effective stress transfer. 2. Apply acrylic acid treatment to enhance surface interactions between the polymer matrix and reinforcing agents. 3. Optimize 3D printing parameters to control porosity, targeting less than 5% void spaces. 4. Mimic the hierarchical structure of natural proteins like collagen to achieve superior mechanical properties. 5. Integrate sensors within the printed structures to enable real-time monitoring and automation.", "unexpected_properties": "The resulting material may exhibit self-healing properties due to the dynamic interactions between the acrylic acid-treated surfaces and the polymer matrix. Additionally, the material could demonstrate enhanced thermal stability, allowing it to function in a wider range of environmental conditions.", "comparison": "Compared to traditional materials like steel or aluminum, the proposed material offers a higher strength-to-weight ratio and greater flexibility. Unlike conventional 3D printed polymers, this material will have enhanced mechanical properties due to the optimized porosity and reinforcement strategies. Compared to natural structural proteins, it offers superior durability and adaptability for robotic applications.", "novelty": "This research advances the field by combining biomimetic 3D printing with novel reinforcement and surface treatment techniques to create materials with unprecedented mechanical properties. It bridges the gap between natural and synthetic materials, offering new possibilities for automation and robotics, and sets a new standard for material design in engineering applications."}
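The generated idea is a JSON object with seven fields. A minimal sketch of validating such output before further processing, assuming the raw text is available as a string; `parse_idea` is a hypothetical helper, not a function from the library. Passing `strict=False` lets `json.loads` accept raw newlines inside string values, which model output sometimes contains.

```python
import json

# Field names taken from the JSON output shown above.
EXPECTED_KEYS = (
    "hypothesis",
    "outcome",
    "mechanisms",
    "design_principles",
    "unexpected_properties",
    "comparison",
    "novelty",
)


def parse_idea(raw: str) -> dict:
    """Parse the model's JSON output and check that every field is present."""
    idea = json.loads(raw, strict=False)  # tolerate control characters in strings
    missing = [key for key in EXPECTED_KEYS if key not in idea]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return idea
```

Failing early on a missing field avoids passing an incomplete idea to the later expansion steps.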

agents ----> automatic driving


0it [00:00, ?it/s]

### Expanded Hypothesis

The integration of biomimetic 3D printing with reinforced agents that undergo stress transfer can create robotic components with mechanical properties surpassing those of natural structural proteins. This is achieved by optimizing 

### Expanded Outcome

The proposed research aims to develop robotic components with mechanical properties that surpass those of natural structural proteins by leveraging biomimetic 3D printing techniques. The target tensile strength of up to 150 MPa is amb

### Expanded Mechanisms

The reinforcement process described involves the integration of carbon nanotubes (CNTs) into a polymer matrix, which is a critical aspect of enhancing the mechanical properties of the resulting composite material. The effectiveness

### Expanded Design Principles

1. **Utilize Carbon Nanotubes as Reinforcing Agents:**
   - **Rationale:** Carbon nanotubes (CNTs) are selected for their exceptional tensile strength, which can exceed 60 GPa, and their Young's modulus, which can reach up t

### Expanded Unexpected Properties

The initial hypothesis suggests that the material may exhibit self-healing properties and enhanced thermal stability. To critically assess and expand upon these claims, we must delve into the underlying mechanisms and pr

### Expanded Comparison

The proposed biomimetic 3D-printed material, reinforced with carbon nanotubes (CNTs) and treated with acrylic acid, presents several advantages over traditional materials such as steel, aluminum, and conventional 3D-printed polymer

### Expanded Novelty

The proposed research introduces a transformative approach to material design by integrating biomimetic 3D printing with advanced reinforcement and surface treatment techniques. This approach is novel in several key aspects:

1. **Int

---------------------------------------------


The most impactful scientific question that can be tackled with molecular modeling from this document is: **How does the acrylic acid treatment enhance the interfacial bonding between carbon nanotubes (CNTs) and the polymer matrix, and what is the impact of this enhancement on the stress transfer efficiency and overall mechanical properties of the composite material?**

### Key Steps to Set Up and Conduct Molecular Modeling and Simulation:

1. **Define the System and Objectives:**
   - **Objective:** To understand the molecular interactions between acrylic acid-treated polymer surfaces and CNTs, and how these interactions influence stress transfer and mechanical properties.
   - **System Components:** The system will include a polymer matrix (e.g., poly(lactic acid) or polycaprolactone), functionalized CNTs, and acrylic acid molecules.

2. **Select Appropriate Molecular Modeling Techniques:**
   - **Molecular Dynamics (MD) Simulations:** Use MD simulations to model the dynamic interactions at the molecular level. This will help in visualizing how acrylic acid functional groups interact with CNTs and the polymer matrix.
   - **Density Functional Theory (DFT):** Employ DFT calculations to accurately predict the electronic structure and bonding characteristics of the acrylic acid-CNT-polymer interface.

3. **Prepare the Molecular Models:**
   - **Polymer Matrix:** Construct a representative segment of the polymer matrix, ensuring it reflects the typical chain length and cross-linking density.
   - **Functionalized CNTs:** Model CNTs with appropriate functional groups (e.g., carboxyl or hydroxyl groups) introduced through functionalization.
   - **Acrylic Acid:** Include acrylic acid molecules with carboxyl groups available for interaction with both the polymer and CNTs.

4. **Set Up Simulation Parameters:**
   - **Force Fields:** Choose suitable force fields for MD simulations, such as COMPASS or OPLS-AA, which can accurately describe the interactions between organic molecules and CNTs.
   - **Simulation Environment:** Define the simulation box dimensions and apply periodic boundary conditions to mimic an infinite system.
   - **Temperature and Pressure:** Set the simulation temperature and pressure to reflect typical processing or operating conditions (e.g., room temperature and atmospheric pressure).

5. **Conduct Simulations:**
   - **Equilibration:** Perform energy minimization and equilibration runs to stabilize the system before production simulations.
   - **Production Runs:** Conduct long-duration MD simulations to observe the interaction dynamics and stress transfer mechanisms. Monitor key parameters such as bond formation, interaction energies, and stress distribution.

6. **Analyze Results:**
   - **Interfacial Bonding:** Analyze the formation and stability of hydrogen bonds or covalent bonds between acrylic acid and CNTs/polymer. Use radial distribution functions (RDF) and bond angle distributions to quantify interactions.
   - **Stress Transfer Efficiency:** Evaluate the stress transfer efficiency by applying external forces and measuring the response of the composite. Calculate stress-strain curves and identify the role of interfacial interactions in load distribution.
   - **Mechanical Properties:** Use the simulation data to estimate mechanical properties such as Young's modulus, tensile strength, and toughness.

7. **Unique Aspects of the Planned Work:**
   - **Dynamic Bonding Analysis:** Focus on the dynamic nature of the acrylic acid interactions, exploring reversible bonding mechanisms that could contribute to self-healing properties.
   - **Multi-Scale Modeling:** Integrate molecular-level insights with continuum-scale models (e.g., finite element analysis) to predict macroscopic mechanical behavior.
   - **Comparative Studies:** Conduct comparative simulations with untreated and differently treated interfaces to highlight the specific benefits of acrylic acid treatment.

By addressing this scientific question through molecular modeling, the research can provide detailed insights into the molecular mechanisms underlying the enhanced mechanical properties of the composite material, guiding further experimental and theoretical developments.
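The stress-strain analysis in step 6 can be illustrated with a small post-processing sketch: Young's modulus is estimated as the least-squares slope of stress against strain in the small-strain (linear elastic) region. The function name, the 2% strain cutoff, and the data points are assumptions for illustration; the numbers are synthetic and are not simulation results.

```python
def youngs_modulus(strain, stress, max_strain=0.02):
    """Least-squares slope of stress vs. strain for strain <= max_strain."""
    points = [(e, s) for e, s in zip(strain, stress) if e <= max_strain]
    if len(points) < 2:
        raise ValueError("need at least two points in the linear region")
    n = len(points)
    mean_e = sum(e for e, _ in points) / n
    mean_s = sum(s for _, s in points) / n
    var = sum((e - mean_e) ** 2 for e, _ in points)
    if var == 0:
        raise ValueError("strain values in the linear region are all equal")
    cov = sum((e - mean_e) * (s - mean_s) for e, s in points)
    return cov / var


# Synthetic curve (stress in GPa): linear up to 2% strain, then a plateau.
strain = [0.0, 0.005, 0.01, 0.015, 0.02, 0.03, 0.04]
stress = [0.0, 0.01, 0.02, 0.03, 0.04, 0.04, 0.04]

modulus_gpa = youngs_modulus(strain, stress)
```

Restricting the fit to small strains keeps the post-yield plateau from biasing the slope downward.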

The most impactful scientific question that can be tackled with synthetic biology from this document is:

**"How can synthetic biology be used to engineer biological systems that mimic the hierarchical structure and mechanical properties of natural structural proteins, such as collagen and elastin, to enhance the performance of biomimetic 3D-printed materials?"**

### Key Steps to Set Up and Conduct Experimental Work:

1. **Define the Target Properties and Functions:**
   - **Objective:** Identify the specific mechanical properties and structural features of natural proteins like collagen and elastin that are desired in the synthetic material.
   - **Parameters:** Focus on tensile strength, elasticity, hierarchical structure, and self-healing capabilities.

2. **Design Synthetic Biological Constructs:**
   - **Gene Selection:** Identify and select genes responsible for the production of structural proteins with desired properties. This may include genes encoding for collagen, elastin, or other relevant proteins.
   - **Synthetic Pathways:** Design synthetic pathways to express these proteins in a host organism, such as bacteria or yeast, using synthetic biology tools like CRISPR/Cas9 for precise genetic modifications.

3. **Host Organism Engineering:**
   - **Strain Development:** Engineer microbial strains (e.g., E. coli, yeast) to optimize the production of target proteins. This involves modifying metabolic pathways to enhance protein yield and stability.
   - **Promoter Optimization:** Use synthetic promoters to control the expression levels of the target proteins, ensuring efficient production and minimizing metabolic burden on the host.

4. **Protein Production and Purification:**
   - **Cultivation:** Grow the engineered strains under controlled conditions to maximize protein expression. Optimize parameters such as temperature, pH, and nutrient availability.
   - **Purification:** Develop purification protocols to isolate the target proteins, using techniques like affinity chromatography or precipitation methods.

5. **Material Fabrication and Characterization:**
   - **Biomimetic Assembly:** Use the purified proteins to fabricate biomimetic materials. This could involve techniques like electrospinning or layer-by-layer assembly to replicate the hierarchical structure of natural proteins.
   - **Characterization:** Employ techniques such as scanning electron microscopy (SEM), atomic force microscopy (AFM), and tensile testing to evaluate the structural and mechanical properties of the fabricated materials.

6. **Integration with Synthetic Polymers:**
   - **Composite Formation:** Combine the biologically derived proteins with synthetic polymers, such as those reinforced with carbon nanotubes, to create hybrid materials with enhanced properties.
   - **Surface Modification:** Apply treatments like acrylic acid to improve interfacial bonding between the biological and synthetic components.

7. **Functional Testing and Optimization:**
   - **Mechanical Testing:** Conduct comprehensive mechanical testing to assess tensile strength, elasticity, and durability. Use dynamic mechanical analysis (DMA) to evaluate viscoelastic properties.
   - **Self-Healing Assessment:** Test the self-healing capabilities of the material by inducing controlled damage and measuring recovery of mechanical properties.

8. **Iterative Design and Improvement:**
   - **Feedback Loop:** Use data from characterization and testing to refine the synthetic biology constructs and material fabrication processes. Implement a feedback loop to iteratively improve material performance.
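The self-healing assessment in step 7 is commonly summarized as a healing efficiency: the ratio of a property measured after healing to its value in the undamaged material, expressed as a percentage. A minimal sketch under that definition; the function name is hypothetical and the strength values are synthetic placeholders, not measurements.

```python
def healing_efficiency(virgin_value: float, healed_value: float) -> float:
    """Percentage of the undamaged property recovered after healing."""
    if virgin_value <= 0:
        raise ValueError("virgin_value must be positive")
    return 100.0 * healed_value / virgin_value


# Synthetic tensile strengths (MPa): undamaged, then after successive
# damage-and-heal cycles.
virgin_strength = 100.0
healed_strengths = [90.0, 80.0, 75.0]

efficiencies = [healing_efficiency(virgin_strength, h) for h in healed_strengths]
```

Tracking the efficiency across cycles gives the feedback loop in step 8 a single number to compare between design iterations.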

### Unique Aspects of the Planned Work:

- **Interdisciplinary Approach:** This work uniquely combines synthetic biology with materials science and engineering, leveraging the strengths of each field to create advanced biomimetic materials.
- **Hierarchical Design:** The focus on replicating the hierarchical structure of natural proteins is a novel aspect, aiming to achieve superior mechanical properties through biomimicry.
- **Integration of Biological and Synthetic Components:** The creation of hybrid materials that integrate biologically derived proteins with synthetic polymers and nanomaterials is a distinctive feature, offering a new paradigm in material design.
- **Real-Time Monitoring:** Incorporating sensors within the material to monitor mechanical performance and environmental conditions in real-time is an innovative approach, enhancing the material's functionality and adaptability.

By addressing this scientific question through the outlined experimental work, the research could lead to significant advancements in the development of materials for robotics, automation, and other high-performance applications.