In [1]:
import datetime
from mdcrow.utils import _make_llm

In [2]:
prompt = "Write me a code to run simulation of fibronectin."
model = "gpt-4o-2024-08-06"

In [3]:
# Parameters
prompt = "What are the common parameters used to simulate fibronectin?"


In [4]:
llm = _make_llm(model, temp=0.1, streaming=True)

system_prompt = (
    "You are an expert molecular dynamics scientist, and your "
    "task is to respond to the question or "
    "solve the problem in its entirety to the best of your ability. "
    "If any part of the task requires you to perform an action that "
    "you are not capable of completing, please write a runnable "
    "Python script for that step and move on. For literature papers, "
    "use and process papers from the `paper_collection` folder. "
    "For .pdb files, download them from the RCSB website using `requests`. "
    "To preprocess PDB files, you will use PDBFixer. "
    "To get information about proteins, retrieve data from the UniProt database. "
    "For anything related to simulations, you will use OpenMM, "
    "and for anything related to analyses, you will use MDTraj. "
    "At the end, combine any scripts into one script. "
)
messages = [
    ("system", system_prompt),
    ("human", prompt),
]

now = datetime.datetime.now()
date = now.strftime("%Y-%m-%d")
print("date:",date)
time = now.strftime("%H:%M:%S")
print("time:",time)
print("LLM: ",llm.model_name,"\nTemperature: ",llm.temperature)

date: 2024-10-16
time: 20:41:48
LLM:  gpt-4o-2024-08-06 
Temperature:  0.1


In [5]:
ai_msg = llm.invoke(messages)
print(ai_msg.content)

Simulating fibronectin, a large extracellular matrix protein, involves several key parameters and considerations. Here are the common parameters and steps involved in setting up a molecular dynamics simulation for fibronectin:

1. **Force Field Selection**:
   - Choose an appropriate force field for proteins, such as AMBER, CHARMM, or OPLS-AA. These force fields provide parameters for bonded and non-bonded interactions.

2. **Initial Structure**:
   - Obtain the initial structure of fibronectin from the Protein Data Bank (PDB). If the structure is incomplete or has missing atoms, use tools like PDBFixer to repair it.

3. **Solvation**:
   - Solvate the protein in a water box. Typically, TIP3P or SPC/E water models are used. The size of the water box should be large enough to accommodate the protein and allow for periodic boundary conditions.

4. **Ions and Neutralization**:
   - Add ions (e.g., Na+ and Cl-) to neutralize the system and mimic physiological conditions. The ionic concentration can be set to a specific value, such as 0.15 M, to represent physiological salt conditions.

5. **Temperature and Pressure**:
   - Set the simulation temperature to physiological conditions, usually around 300 K (27°C). Use a thermostat like Langevin dynamics to maintain the temperature.
   - If pressure control is needed, use a barostat to maintain the pressure at 1 atm.

6. **Time Step and Simulation Length**:
   - Use a time step of 1-2 fs for integration. The total simulation time can vary depending on the research question but typically ranges from nanoseconds to microseconds.

7. **Equilibration and Production Runs**:
   - Perform energy minimization to relieve any steric clashes or strains in the initial structure.
   - Equilibrate the system gradually by slowly heating it to the target temperature and allowing the solvent and ions to relax around the protein.
   - Conduct production runs to collect data for analysis.

8. **Boundary Conditions**:
   - Apply periodic boundary conditions to simulate an infinite system and reduce edge effects.

9. **Analysis**:
   - Analyze the trajectory data using tools like MDTraj to study protein dynamics, conformational changes, and interactions with other molecules.

Here's a basic Python script outline to set up a simulation using OpenMM:

```python
from simtk.openmm import app
from simtk.openmm import *
from simtk.unit import *

# Load the PDB file
pdb = app.PDBFile('fibronectin.pdb')

# Choose a force field
forcefield = app.ForceField('amber99sb.xml', 'tip3p.xml')

# Create a system with a water box
modeller = app.Modeller(pdb.topology, pdb.positions)
modeller.addSolvent(forcefield, model='tip3p', padding=1.0*nanometers)

# Create a system object
system = forcefield.createSystem(modeller.topology, nonbondedMethod=app.PME,
                                 nonbondedCutoff=1.0*nanometers, constraints=app.HBonds)

# Add a thermostat and barostat
system.addForce(AndersenThermostat(300*kelvin, 1/picosecond))
system.addForce(MonteCarloBarostat(1*atmosphere, 300*kelvin))

# Set up the integrator
integrator = LangevinIntegrator(300*kelvin, 1/picosecond, 0.002*picoseconds)

# Create a simulation object
simulation = app.Simulation(modeller.topology, system, integrator)

# Set initial positions
simulation.context.setPositions(modeller.positions)

# Minimize energy
simulation.minimizeEnergy()

# Equilibrate
simulation.context.setVelocitiesToTemperature(300*kelvin)
simulation.step(10000)  # Equilibration steps

# Run production simulation
simulation.reporters.append(app.DCDReporter('trajectory.dcd', 1000))
simulation.reporters.append(app.StateDataReporter('output.log', 1000, step=True,
    potentialEnergy=True, temperature=True))
simulation.step(500000)  # Production steps
```

This script provides a basic framework for setting up a molecular dynamics simulation of fibronectin using OpenMM. You may need to adjust parameters and settings based on your specific research needs.

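The numbers quoted in the response above (a 0.002 ps time step, 10000 equilibration steps, 500000 production steps, a reporter interval of 1000, and 0.15 M salt) can be turned into simulated time, frame counts, and an approximate ion count with a little arithmetic. The sketch below is only illustrative: the helper names and the 1000 nm^3 example box volume are assumptions introduced here, not part of the notebook or of OpenMM, and the ion estimate ignores the extra counter-ions needed to neutralize the protein's net charge.

```python
# Illustrative helpers (hypothetical names, not from the notebook or OpenMM).

AVOGADRO = 6.02214076e23   # particles per mole
LITERS_PER_NM3 = 1e-24     # 1 nm^3 expressed in liters


def simulated_time_ns(n_steps: int, timestep_fs: float) -> float:
    """Simulated time in nanoseconds for n_steps steps of timestep_fs femtoseconds."""
    return n_steps * timestep_fs / 1e6  # 1 ns = 1e6 fs


def n_frames(n_steps: int, report_interval: int) -> int:
    """Number of frames written when a reporter fires every report_interval steps."""
    return n_steps // report_interval


def salt_pairs(molar: float, box_volume_nm3: float) -> int:
    """Approximate number of Na+/Cl- pairs for a target molarity in a given box volume."""
    moles = molar * box_volume_nm3 * LITERS_PER_NM3
    return round(moles * AVOGADRO)


if __name__ == "__main__":
    # Step counts and time step taken from the script in the response above.
    print("equilibration (ns):", simulated_time_ns(10_000, 2.0))
    print("production (ns):", simulated_time_ns(500_000, 2.0))
    print("trajectory frames:", n_frames(500_000, 1000))
    # Assumed example box of 10 nm x 10 nm x 10 nm = 1000 nm^3.
    print("NaCl pairs at 0.15 M:", salt_pairs(0.15, 1000.0))
```

With these settings the production run covers about 1 ns, which sits at the short end of the nanoseconds-to-microseconds range mentioned in the response, so the step count would likely need to grow for most research questions.
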
In [6]:
now = datetime.datetime.now()
date = now.strftime("%Y-%m-%d")
print("date:",date)
time = now.strftime("%H:%M:%S")
print("time:",time)

date: 2024-10-16
time: 20:42:00
