In [1]:
import datetime
import os
from mdagent import MDAgent
import matplotlib.pyplot as plt

In [2]:
prompt3 = "Download the PDB file for PDB ID 1GZX. Then, analyze the secondary structure of the protein and tell me how many chains, sheets, etc. there are."
llm_var = "gpt-4o-2024-05-13"
tools = "all"
agent = MDAgent(agent_type="Structured", model=llm_var, top_k_tools=tools)
now = datetime.datetime.now()
date = now.strftime("%Y-%m-%d")
print("date:",date)
time = now.strftime("%H:%M:%S")
print("time:",time)
print("LLM: ",agent.llm.model_name,"\nTemperature: ",agent.llm.temperature)

date: 2024-07-16
time: 20:00:36
LLM:  gpt-4o-2024-05-13 
Temperature:  0.1


In [3]:
agent.run(prompt3)

Thought: To address the question, I need to first download the PDB file for PDB ID 1GZX. After downloading, I will analyze the secondary structure of the protein to determine the number of chains, sheets, and other structural elements.

Action: 
```
{
  "action": "PDBFileDownloader",
  "action_input": "1GZX"
}
```
PDB file found with this ID: 1GZX



Now that the PDB file for 1GZX has been downloaded, the next step is to analyze the secondary structure of the protein to determine the number of chains, sheets, and other structural elements.

Action:
```
{
  "action": "ComputeDSSP",
  "action_input": {
    "traj_file": "1GZX_200039"
  }
}
```
The secondary structure analysis has provided the counts for helices, strands, coils, and non-protein residues. However, it did not provide the number of chains. To get the number of chains, I will use the `SummarizeProteinStructure` tool.

Action:
```
{
  "action": "SummarizeProteinStructure",
  "action_input": {
    "traj_file": "1GZX_200039"
  }
}
```
The analysis of the secondary structure and the summary of the protein structure for PDB ID 1GZX is complete. Here are the details:

- Number of Chains: 12
- Number of Helices: 444
- Number of Strands: 0
- Number of Coils: 130
- Non-protein Residues: 213

Final Answer: The PDB ID 1GZX protein has 12 chains, 444 helices, 0 strands, 130 coils, and 213 non-protein residues.

('The analysis of the secondary structure and the summary of the protein structure for PDB ID 1GZX is complete. Here are the details:\n\n- Number of Chains: 12\n- Number of Helices: 444\n- Number of Strands: 0\n- Number of Coils: 130\n- Non-protein Residues: 213\n\nFinal Answer: The PDB ID 1GZX protein has 12 chains, 444 helices, 0 strands, 130 coils, and 213 non-protein residues.',
 '3Q4R8IC7')
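
The bulleted counts in the returned answer text can be turned into numbers so they are easier to compare with an independent check. The following is only a minimal sketch: `parse_counts` is a hypothetical helper, not part of MDAgent, and the input string is copied by hand from the tuple shown above.

```python
import re

# Bullet lines copied from the answer text returned by agent.run above.
answer_text = (
    "- Number of Chains: 12\n"
    "- Number of Helices: 444\n"
    "- Number of Strands: 0\n"
    "- Number of Coils: 130\n"
    "- Non-protein Residues: 213\n"
)


def parse_counts(text):
    """Hypothetical helper: map each '- Label: N' bullet to an integer."""
    pattern = re.compile(
        r"^- (?P<label>[^:\n]+):\s*(?P<value>\d+)\s*$", re.MULTILINE
    )
    return {m["label"].strip(): int(m["value"]) for m in pattern.finditer(text)}


counts = parse_counts(answer_text)
print(counts)
```

The resulting dictionary could then be compared key by key with values computed directly from the structure file.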

In [4]:
now = datetime.datetime.now()
date = now.strftime("%Y-%m-%d")
print("date:",date)
time = now.strftime("%H:%M:%S")
print("time:",time)

date: 2024-07-16
time: 20:10:03
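
The two printed timestamps bracket the agent run, so their difference gives a rough upper bound on how long the run took. A minimal sketch, assuming the values are copied by hand from the outputs of `In [2]` and `In [4]`; the interval also includes any idle time between executing the cells.

```python
import datetime

# Timestamps copied from the printed output of cells In [2] and In [4].
fmt = "%Y-%m-%d %H:%M:%S"
start = datetime.datetime.strptime("2024-07-16 20:00:36", fmt)
end = datetime.datetime.strptime("2024-07-16 20:10:03", fmt)

elapsed = end - start
print("elapsed:", elapsed)                  # elapsed: 0:09:27
print("seconds:", elapsed.total_seconds())  # seconds: 567.0
```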


In [5]:
registry = agent.path_registry
print(registry.list_path_names_and_descriptions())

Files found in registry: 1GZX_200039: PDB file downloaded from RSCB, PDBFile ID: 1GZX_200039, rec0_200044: dssp values for trajectory with id: 1GZX_200039


In [9]:
import mdtraj as md

# Load the downloaded PDB file through its registry ID.
path = registry.get_mapped_path("1GZX_200039")
traj = md.load(path)
top = traj.topology
number_of_chains = top.n_chains
# Simplified DSSP assigns one code per residue: 'H' (helix), 'E' (strand), 'C' (coil).
secondary_structure = md.compute_dssp(traj,simplified=True)
print("Number of chains: ",number_of_chains)
print("Number of residues in sheets: ",len([i for i in secondary_structure[0] if i == 'E']))
print("Number of residues in helices: ",len([i for i in secondary_structure[0] if i == 'H']))
print("Number of residues in coils: ",len([i for i in secondary_structure[0] if i == 'C']))


Number of chains:  12
Number of residues in sheets:  0
Number of residues in helices:  444
Number of residues in coils:  130
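
The counts above are per-residue totals, so 444 is the number of residues assigned to helices rather than the number of distinct helices. If segment counts are wanted, one possible approach is to count contiguous runs of each DSSP code. The sketch below uses a toy sequence that is an assumption for illustration and is not taken from 1GZX; `count_segments` is a hypothetical helper.

```python
from itertools import groupby


def count_segments(codes):
    """Count contiguous runs of each DSSP code in a per-residue sequence."""
    segments = {}
    for code, _run in groupby(codes):
        segments[code] = segments.get(code, 0) + 1
    return segments


# Toy per-residue assignment (illustrative only, not real 1GZX data).
toy = list("CCHHHHHCCEEEECCHHHC")
print(count_segments(toy))  # {'C': 4, 'H': 2, 'E': 1}
```

Applied to `secondary_structure[0]` from the cell above, this would give approximate segment counts. Runs that continue across a chain boundary would be merged, so splitting the sequence per chain first is likely more accurate, and the `'NA'` entries that mdtraj assigns to non-protein residues would be counted as their own code.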


# Experiment Result:
### Completed without Exception or TimeOut Errors ✅
### Attempted all necessary steps ✅
### Logic makes sense ✅
### Correct Answer  ✅