In [1]:
import datetime
import os
from mdagent import MDAgent
import matplotlib.pyplot as plt

In [2]:
prompt22 = "Download and clean protein 1A3N"
llm_var = "gpt-4o-2024-05-13"
tools = "all"
agent = MDAgent(agent_type="Structured", model=llm_var, top_k_tools=tools)
now = datetime.datetime.now()
date = now.strftime("%Y-%m-%d")
print("date and time:",date)
time = now.strftime("%H:%M:%S")
print("time:",time)
print("LLM: ",agent.llm.model_name,"\nTemperature: ",agent.llm.temperature)

date and time: 2024-07-16
time: 21:53:11
LLM:  gpt-4o-2024-05-13 
Temperature:  0.1
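
The date and time stamp is printed the same way in more than one cell, and the label "date and time:" is attached to a value that holds only the date. A minimal sketch of a small helper that returns both strings is shown here. The name `format_run_stamp` is hypothetical and is not part of `mdagent`.

```python
import datetime


def format_run_stamp(now=None):
    """Return (date, time) strings in the formats used in this notebook."""
    if now is None:
        now = datetime.datetime.now()
    return now.strftime("%Y-%m-%d"), now.strftime("%H:%M:%S")


date_str, time_str = format_run_stamp()
print("date:", date_str)
print("time:", time_str)
```

Passing an explicit `datetime` object keeps the helper easy to check against a known value.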


In [3]:
agent.run(prompt22)

Thought: To address the question, I need to first download the PDB file for the protein 1A3N and then clean it by removing heterogens, adding missing atoms and hydrogens, replacing nonstandard residues, and removing water.

Action: 
```
{
  "action": "PDBFileDownloader",
  "action_input": "1A3N"
}
```
PDB file found with this ID: 1A3N



Now that the PDB file for protein 1A3N has been downloaded, the next step is to clean the file by removing heterogens, adding missing atoms and hydrogens, replacing nonstandard residues, and removing water.

Action:
```
{
  "action": "CleaningToolFunction",
  "action_input": {
    "pdb_id": "1A3N_215315",
    "replace_nonstandard_residues": true,
    "add_missing_atoms": true,
    "remove_heterogens": true,
    "remove_water": true,
    "add_hydrogens": true
  }
}
```
file path: /Users/samanthacox/Desktop/md-agent/ckpt/ckpt_95/pdb/1A3N_raw_20240716_215315.pdb
Final Answer: The PDB file for protein 1A3N has been successfully downloaded and cleaned. The cleaned file has the following modifications:
- Nonstandard residues have been replaced with standard ones.
- Missing atoms have been added.
- Heterogens have been removed.
- Water molecules have been removed.
- Hydrogens have been added.

The cleaned file ID is: 1A3N_215326.

Prompt: The agent is tasked with downloading and cleaning the PDB

('Final Answer: The PDB file for protein 1A3N has been successfully downloaded and cleaned. The cleaned file has the following modifications:\n- Nonstandard residues have been replaced with standard ones.\n- Missing atoms have been added.\n- Heterogens have been removed.\n- Water molecules have been removed.\n- Hydrogens have been added.\n\nThe cleaned file ID is: 1A3N_215326.',
 'D5L0M8VL')
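
To make two of the cleaning steps listed in the final answer concrete, the following is a rough sketch of dropping heterogens and water from PDB text using only the fixed-column record layout. It is an illustration under stated assumptions, not how `CleaningToolFunction` is implemented: the function name `strip_heterogens_and_water` and the `WATER_NAMES` set are hypothetical, and the sketch neither replaces nonstandard residues nor adds atoms or hydrogens.

```python
# Residue names treated as water in this sketch (an assumption, not exhaustive).
WATER_NAMES = {"HOH", "WAT"}


def strip_heterogens_and_water(pdb_lines):
    """Return the lines that are neither HETATM records nor water atoms.

    Relies on the PDB fixed-column layout: the record name is in
    columns 1-6 and the residue name is in columns 18-20.
    """
    kept = []
    for line in pdb_lines:
        record = line[:6].strip()
        res_name = line[17:20].strip()
        if record == "HETATM":
            continue  # ligands, cofactors such as HEM, and most waters
        if record == "ATOM" and res_name in WATER_NAMES:
            continue  # water written as ATOM records
        kept.append(line)
    return kept
```

One caveat: nonstandard residues are often stored as `HETATM` records too, so a filter this blunt would delete them, whereas the cleaning step described above replaces them with standard residues.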

In [4]:
now = datetime.datetime.now()
date = now.strftime("%Y-%m-%d")
print("date and time:",date)
time = now.strftime("%H:%M:%S")
print("time:",time)
registry = agent.path_registry
paths_and_descriptions = registry.list_path_names_and_descriptions()
print("\n".join(paths_and_descriptions.split(",")))

date and time: 2024-07-16
time: 21:53:32
Files found in registry: 1A3N_215315: PDB file downloaded from RSCB
 PDBFile ID: 1A3N_215315
 1A3N_215326: Cleaned File:  Removed Heterogens
 and Water Removed.  Replaced Nonstandard Residues. Added Hydrogens at pH 7.0. Missing Atoms Added and replaces nonstandard residues. 


In [6]:
path_before = registry.get_mapped_path("1A3N_215315")
path_after = registry.get_mapped_path("1A3N_215326")
assert os.path.exists(path_before)
assert os.path.exists(path_after)
assert path_before != path_after

In [7]:
import mdtraj as md
raw_file = md.load(path_before)
clean_file = md.load(path_after)

In [8]:
# Compare the three-letter residue names present before and after cleaning
residues_before = {str(res)[:3] for res in raw_file.top.residues}
residues_after = {str(res)[:3] for res in clean_file.top.residues}
print("Residues before:",residues_before)
print("Residues after:",residues_after)
# Names that appear only in the raw file were removed by the cleaning step
print("Residues removed:",residues_before - residues_after)

Residues before: {'HOH', 'ALA', 'TRP', 'TYR', 'ASP', 'LYS', 'ARG', 'ASN', 'GLN', 'PHE', 'LEU', 'CYS', 'HIS', 'GLY', 'VAL', 'GLU', 'SER', 'MET', 'THR', 'HEM', 'PRO'}
Residues after: {'ALA', 'TRP', 'TYR', 'ASP', 'LYS', 'ARG', 'ASN', 'GLN', 'PHE', 'LEU', 'CYS', 'HIS', 'GLY', 'VAL', 'GLU', 'SER', 'MET', 'THR', 'PRO'}
Residues removed: {'HOH', 'HEM'}
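
The residue comparison above confirms that water and heme were removed, but it does not show whether hydrogens were added. A minimal sketch of a dependency-free check is given here: it reads the residue name and element columns straight from PDB text. The helper name `summarize_pdb` is hypothetical, and the sketch assumes the files follow the standard fixed-column layout with the element symbol in columns 77-78.

```python
from collections import Counter


def summarize_pdb(pdb_lines):
    """Return (residue_names, element_counts) for ATOM/HETATM records.

    Column positions follow the PDB fixed-width format:
    residue name in columns 18-20, element symbol in columns 77-78.
    """
    residues = set()
    elements = Counter()
    for line in pdb_lines:
        if line[:6].strip() not in {"ATOM", "HETATM"}:
            continue  # skip headers, remarks, TER, CONECT, and so on
        residues.add(line[17:20].strip())
        element = line[76:78].strip().upper()
        if element:  # short lines may lack the element column
            elements[element] += 1
    return residues, elements


# Usage with the paths from the cells above (not executed here):
# with open(path_before) as f:
#     res_before, el_before = summarize_pdb(f)
# with open(path_after) as f:
#     res_after, el_after = summarize_pdb(f)
# print("Hydrogens before/after:", el_before["H"], el_after["H"])
```

If hydrogens were added during cleaning, the `"H"` count for the cleaned file should be larger than for the raw file.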


# Experiment Result:
### Completed without Exception or TimeOut Errors ✅
### Attempted all necessary steps ✅
### Logic makes sense ✅
### Correct Answer  ✅