In [1]:
import datetime
import os
from mdagent import MDAgent
import matplotlib.pyplot as plt

In [2]:
# today's date and time
now = datetime.datetime.now()
date = now.strftime("%Y-%m-%d")
print("date and time:",date)
time = now.strftime("%H:%M:%S")
print("time:",time)

date and time: 2024-07-20
time: 21:57:56


In [3]:
prompt1 = "Simulate pdb  1MBN at two different temperatures: 300K, 400K for 1ns seconds each. Plot RMSD of both over time, and compare the final secondary structures at the end of the simulations."
llm_var = "gpt-4o-2024-05-13"
tools = "all"
agent = MDAgent(agent_type="Structured", model=llm_var, top_k_tools=tools)

In [4]:
print("LLM: ",agent.llm.model_name,"\nTemperature: ",agent.llm.temperature)

LLM:  gpt-4o-2024-05-13 
Temperature:  0.1


In [5]:
agent.run(prompt1)

Thought: To address the question, I need to perform the following steps:
1. Download the PDB file for 1MBN.
2. Set up and run two separate simulations for 1MBN at 300K and 400K for 1ns each.
3. Compute and plot the RMSD over time for both simulations.
4. Compare the final secondary structures at the end of the simulations.

First, I will download the PDB file for 1MBN.

Action:
```
{
  "action": "PDBFileDownloader",
  "action_input": "1MBN"
}
```
PDB file found with this ID: 1MBN



Next, I need to set up and run two separate simulations for 1MBN at 300K and 400K for 1ns each. I will start with the simulation at 300K.

Action:
```
{
  "action": "SetUpandRunFunction",
  "action_input": {
    "pdb_id": "1MBN_215759",
    "forcefield_files": [],
    "save": true,
    "system_params": {
      "nonbondedMethod": "NoCutoff",
      "constraints": "None",
      "rigidWater": false,
      "solvate": false
    },
    "integrator_params": {
      "integrator_type": "LangevinMiddle",
      "Temperature": "300 * kelvin",
      "Friction": "1.0 / picoseconds",
      "Timestep": "0.002 * picoseconds"
    },
    "simulation_params": {
      "Ensemble": "NVT",
      "Number of Steps": 500000,
      "record_interval_steps": 5000,
      "record_params": ["step", "potentialEnergy", "temperature"]
    }
  }
}
```
Setting default forcefields v2
Building system...
About to create system...
Attempts at creating system: 0/3
adding system without solvent
Trying to add component to Forcefie

('The simulations at both 300K and 400K have been successfully completed, and the RMSD over time has been computed and plotted for both temperatures. Additionally, the secondary structures at the end of both simulations have been analyzed using DSSP.\n\nHere are the results:\n\n### RMSD Plots\n- **300K Simulation**: The RMSD plot for the 300K simulation is saved with plot ID `fig0_001221`.\n- **400K Simulation**: The RMSD plot for the 400K simulation is saved with plot ID `fig0_001223`.\n\n### Secondary Structure Comparison\n- **300K Simulation**:\n  - Helix: 10320\n  - Strand: 0\n  - Coil: 4980\n\n- **400K Simulation**:\n  - Helix: 9629\n  - Strand: 0\n  - Coil: 5671\n\nThe results indicate that at 300K, the protein maintains a higher number of helical structures compared to the simulation at 400K, where there is an increase in coil structures. This suggests that the higher temperature leads to a partial unfolding of the protein, reducing the helical content and increasing the coil re
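The raw DSSP counts in the agent's summary are easier to compare as fractions. The sketch below is not part of the original run. It copies the step count, timestep, and record interval from the `SetUpandRunFunction` input above and the helix/strand/coil counts from the summary, then checks the simulated time and converts the counts to fractions. The assumption that the counts are summed over every recorded frame is mine; it is consistent with the totals but is not stated in the output.

```python
# Parameters copied from the SetUpandRunFunction input shown above.
timestep_ps = 0.002
n_steps = 500_000
record_interval = 5_000

sim_time_ns = n_steps * timestep_ps / 1000.0  # picoseconds -> nanoseconds
n_frames = n_steps // record_interval         # frames written per simulation

# DSSP counts copied from the agent's final summary.
counts = {
    "300K": {"helix": 10320, "strand": 0, "coil": 4980},
    "400K": {"helix": 9629, "strand": 0, "coil": 5671},
}

def to_fractions(c):
    """Convert raw counts for one simulation into fractions of the total."""
    total = sum(c.values())
    return {name: value / total for name, value in c.items()}

totals = {label: sum(c.values()) for label, c in counts.items()}
# Assumption: counts are accumulated over all recorded frames, so
# total / n_frames gives the number of residues assigned per frame.
residues_per_frame = {label: t / n_frames for label, t in totals.items()}
fractions = {label: to_fractions(c) for label, c in counts.items()}

print(f"simulated time: {sim_time_ns:.1f} ns, frames: {n_frames}")
for label in counts:
    print(label, f"helix fraction: {fractions[label]['helix']:.3f}")
```

Under that assumption, each simulation covers 1 ns in 100 frames of 153 residues, and the helix fraction drops from roughly 67.5% at 300K to roughly 62.9% at 400K.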

In [6]:
#print final date and time
now = datetime.datetime.now()
date = now.strftime("%Y-%m-%d")
print("date and time:",date)
time = now.strftime("%H:%M:%S")
print("time:",time)

date and time: 2024-07-21
time: 00:12:41


In [7]:
registry = agent.path_registry
paths_and_descriptions = registry.list_path_names_and_descriptions()
print("\n".join(paths_and_descriptions.split(",")))

Files found in registry: 1MBN_215759: PDB file downloaded from RSCB
 PDBFile ID: 1MBN_215759
 1MBN_215807: Cleaned File:  Removed Heterogens
 and Water Removed.  Replaced Nonstandard Residues. Added Hydrogens at pH 7.0. Missing Atoms Added and replaces nonstandard residues. 
 top_sim0_215809: Initial positions for simulation sim0_215809
 sim0_215809: Basic Simulation of Protein 1MBN_215807
 rec0_215810: Simulation trajectory for protein 1MBN_215807 and simulation sim0_215809
 rec1_215810: Simulation state log for protein 1MBN_215807 and simulation sim0_215809
 rec2_215810: Simulation pdb frames for protein 1MBN_215807 and simulation sim0_215809
 top_sim0_232350: Initial positions for simulation sim0_232350
 sim0_232350: Basic Simulation of Protein 1MBN_215807
 rec0_232351: Simulation trajectory for protein 1MBN_215807 and simulation sim0_232350
 rec1_232351: Simulation state log for protein 1MBN_215807 and simulation sim0_232350
 rec2_232351: Simulation pdb frames for protein 1MBN_2158

In [None]:
# Display the RMSD plots saved by the agent (plot IDs taken from its summary)
from IPython.display import Image
rmsd1ID = 'fig0_001223'  # 400K simulation
rmsd2ID = 'fig0_001221'  # 300K simulation
path1 = registry.get_mapped_path(rmsd1ID)
path2 = registry.get_mapped_path(rmsd2ID)

Image(filename=path1)



In [None]:
Image(filename=path2)

In [None]:
# Secondary structure of the first structure file (DSSP, simplified H/E/C codes)
import mdtraj as md

traj1 = "/Users/samanthacox/Desktop/md-agent/ckpt/ckpt_116/records/TOP_sim0_082254_1MBN_082248_20240718_082255.pdb"
traj = md.load(traj1)
top = traj.topology
number_of_chains = top.n_chains
secondary_structure = md.compute_dssp(traj,simplified=True)
# Counts below use the first frame of the loaded file
print("Number of chains: ",number_of_chains)
print("Number of residues in sheets: ",len([i for i in secondary_structure[0] if i == 'E']))
print("Number of residues in helices: ",len([i for i in secondary_structure[0] if i == 'H']))
print("Number of residues in coils: ",len([i for i in secondary_structure[0] if i == 'C']))



In [None]:
# Secondary structure of the second structure file
traj2="/Users/samanthacox/Desktop/md-agent/ckpt/ckpt_116/records/TOP_sim0_095917_1MBN_082248_20240718_095918.pdb"
traj = md.load(traj2)  # load traj2 here, not traj1
top = traj.topology
number_of_chains = top.n_chains
secondary_structure = md.compute_dssp(traj,simplified=True)
print("Number of chains: ",number_of_chains)
print("Number of residues in sheets: ",len([i for i in secondary_structure[0] if i == 'E']))
print("Number of residues in helices: ",len([i for i in secondary_structure[0] if i == 'H']))
print("Number of residues in coils: ",len([i for i in secondary_structure[0] if i == 'C']))
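The two cells above repeat the same counting logic, which makes it easy to load the wrong file by accident. A minimal sketch of one way to factor it out follows. The helper names `summarize_dssp` and `summarize_pdb` are hypothetical and not part of the notebook or of mdtraj; only `md.load` and `md.compute_dssp(..., simplified=True)` come from the cells above. The `frame` argument defaults to the last frame, on the assumption that the final structure is the one of interest.

```python
from collections import Counter

def summarize_dssp(frame_codes):
    """Count simplified DSSP codes ('H', 'E', 'C') for a single frame."""
    tally = Counter(frame_codes)
    return {
        "helix": tally.get("H", 0),
        "strand": tally.get("E", 0),
        "coil": tally.get("C", 0),
    }

def summarize_pdb(path, frame=-1):
    """Load a structure file and summarize DSSP for one frame.

    Hypothetical helper: frame=-1 selects the last frame in the file.
    """
    import mdtraj as md  # imported lazily so summarize_dssp works without mdtraj

    traj = md.load(path)
    dssp = md.compute_dssp(traj, simplified=True)
    return summarize_dssp(dssp[frame])

# Example on a hand-written code sequence (not simulation data).
print(summarize_dssp(list("HHHECC")))
```

With this in place, each structure file could be summarized with a single `summarize_pdb(path)` call, so the two temperatures are guaranteed to go through identical code.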


# Experiment Result:
### Completed without Exception or TimeOut Errors  ✅
### Attempted all necessary steps ✅
### Logic makes sense  ✅
### Correct Answer   ✅
